At Third Sector IT we’re often asked to migrate data from a client’s existing system to a new Salesforce.com org.
Sometimes this is because they are changing databases, and need all of their old data in Salesforce. Other times, they are considering adopting Salesforce and want to use the system with their own data in it. Sometimes customers have a second databases or web service, and want a daily or near-live synchronisation between the two.
If the source is something simple like an Excel spread this can often be achieved using spreadsheets, the Salesforce Data Loader and a lot of patience!
When migrating from a more complex source with multiple relational tables such as Access, MySQL, or MS SQL Server, using data loader can be time consuming and error prone. This is where Talend comes to the rescue! At Third Sector IT we love Talend and use it for virtually all of our complex data migration work both in house and for clients. Here are just a few of its many advantages for this type of work:
- Once a Talend job is set-up you can run it as many times as you like and each run takes a matter of minutes. This allows an iterative and flexible approach to data migration projects.
- Talend comes with built-in connectors for pretty much any data source you can think of (including SQL Server.)
- Talend natively supports the Salesforce.com Bulk Data API.
- Talend is open source and completely free!
To expand on the first bullet point, sometimes a ‘single’ migration will take dozens of migrations. A recent project we did involving an off-the-shelf database had over 50 tables and dirty data within it. After each migration, we would check with the client, ask them to clean the data based on errors, and add a few more tables and fields to the migration.
Jeff Douglas wrote a great blog post on the basics of migrating data from a .csv file to Salesforce. Jeff’s post really helped me and it’s a great place to start if you’re new to Talend.
My blog post today is the first in a series I’ll be writing on using Talend to migrate data to Salesforce from SQL Server. This method has a couple of advantages over using .csv files:
- You won’t have to prepare a new .csv file every time you want to do an upload of data. Talend will pull data directly from your SQL tables and views.
- You can automate the upload, so it runs nightly or even hourly
- Talend can retrieve schemas for objects in your SQL database and keep them in sync with your Talend job.
When the series is finished your Talend job will look something like the one below (click to enlarge images). In this post I’ll cover the first section highlighted in red, creating a connection and retrieving the schema of a SQL Server data source.
Connecting to SQL Server
Create a new Talend job and expand the Metada menu in the left sidebar. Right-click Db Connections and select Create connection.
Give your connection a helpful name, fill in your SQL Server credentials and hit Check. If everything’s ok click finish to create your database connection.
Retrieving the Schema
The next step is to retrieve a schema for each of the SQL objects you want to import from. Handily Talend can do this for you. Right click your Db Connection and select Retrieve Schema.
Select which types of objects you want to work with (tables, views and synonyms are supported). On the next screen select the objects you want to import from. For the purposes of this demo I’m going to select a single table. If you’re working on a more complex migration you can select as many objects as you like.
Click the Retrieve Schema button to, you guessed it, retrieve the schema! Talend will pull in all the fields, keys, data types etc for the objects you selected.
Note – You may have to modify data types here. Salesforce doesn’t allow all of the data types available in SQL e.g. changing Big Integer to Integer.
You should now have a connection to your SQL database and a schema for each object you want to import from. To get this into your Talend job, drag it from the left sidebar onto the pallet. When you do this you’ll be asked what type of component you want to create, select tMSSQLInput. You should now have a SQL input component that looks something like this.
This input isn’t connected to anything at the moment. In the next blog post I’ll cover how to filter your data and map it onto the fields of a Salesforce.com object. A future post will also cover uploading the data to Salesforce via the Bulk API.