Bulk API

Bulk Data API

 Nick Simha
 Technical Alliance Manager

  Bulk Data API basics
  Demo
  Best practices
  Resource list
What is the Bulk Data API?

  REST-based, asynchronous API optimized for loading
   large sets of data.
Why is it useful?

  Enables high-volume integration with Salesforce (volume)
  Enables integration that has to finish in a certain window of time
  Part of a suite of features that enable our customers to store very
   large data volumes in Salesforce
     – Batch Apex
     – Skinny tables
     – Divisions
     – Custom Indexing
     – Etc.
How does the Bulk Data API work?
How can I call the Bulk Data API?

  Through Data Loader
  From any Web Services Client – Java, C# etc.
  From the command line!
  Supported by our integration partners
Using the Bulk API from a client

  Create Job
  Create Batch(es) and add to Job
    – Number of batches determined by the amount of data and the
      limits on batch size.
  Close Job
  Retrieve Batch Status
  Retrieve Batch Result
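
The five steps above can be sketched as an ordered list of HTTP calls. This is a minimal illustration, not a client: only the create-job path (/services/async/17.0/job) appears in this deck. The batch, close, status and result paths are assumptions based on the URI layout described in the Bulk API guide, and the function name lifecycle and the placeholder IDs are hypothetical.

```python
API_VERSION = "17.0"  # version used in this deck's sample request
BASE = f"/services/async/{API_VERSION}/job"


def lifecycle(job_id, batch_ids):
    """Return (method, path, purpose) tuples for one bulk load, in call order.

    Paths other than BASE are assumptions; check them against the API guide.
    """
    calls = [("POST", BASE, "create job")]
    # One POST per batch; the number of batches depends on data volume.
    for _ in batch_ids:
        calls.append(("POST", f"{BASE}/{job_id}/batch", "add batch"))
    # Closing the job signals that no more batches will be added.
    calls.append(("POST", f"{BASE}/{job_id}", "close job"))
    # Processing is asynchronous, so status and results are polled per batch.
    for batch_id in batch_ids:
        calls.append(("GET", f"{BASE}/{job_id}/batch/{batch_id}", "batch status"))
        calls.append(("GET", f"{BASE}/{job_id}/batch/{batch_id}/result", "batch result"))
    return calls


if __name__ == "__main__":
    # JOB_ID and BATCH_n are placeholders; real IDs come back in the responses.
    for method, path, purpose in lifecycle("JOB_ID", ["BATCH_1", "BATCH_2"]):
        print(f"{method:4} {path}  # {purpose}")
```

In a real client the job ID is only known after the first call returns, and each batch ID after its batch is added, so the calls cannot all be prepared up front as they are here.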
Sample Request

 POST /services/async/17.0/job HTTP/1.1
 User-Agent: curl/7.19.6 (i386-pc-win32) libcurl/7.19.6 OpenSSL/0.9.8k zlib/1.2.3
 Host: na6.salesforce.com
 Accept: */*
 Content-Type: application/xml; charset=UTF-8
 Content-Length: 195

 <?xml version="1.0" encoding="UTF-8"?>
 <jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">

 See the Getting Started chapter in the API guide. Use curl --trace-ascii
 <file_name> to capture messages.
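
The sample request is cut off after the opening jobInfo tag. As a rough sketch of how a client might assemble a complete create-job request, the code below builds the headers and XML body without sending anything. The child elements (operation, object, contentType) and the X-SFDC-Session header do not appear in this deck; they are assumptions taken from the Bulk API guide, and build_create_job and SESSION_ID are hypothetical names.

```python
from xml.sax.saxutils import escape

# Namespace of the jobInfo element in the sample request.
NS = "http://www.force.com/2009/06/asyncapi/dataload"


def build_create_job(operation, sobject, content_type="CSV",
                     session_id="SESSION_ID", host="na6.salesforce.com"):
    """Return (method, path, headers, body) for a create-job request.

    Element names and the session header are assumptions, not from the deck.
    """
    body = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<jobInfo xmlns="{NS}">\n'
        f'  <operation>{escape(operation)}</operation>\n'
        f'  <object>{escape(sobject)}</object>\n'
        f'  <contentType>{escape(content_type)}</contentType>\n'
        '</jobInfo>\n'
    ).encode("utf-8")
    headers = {
        "Host": host,
        "X-SFDC-Session": session_id,  # assumed authentication header
        "Content-Type": "application/xml; charset=UTF-8",
        # Content-Length is the byte length of the encoded body.
        "Content-Length": str(len(body)),
    }
    return "POST", "/services/async/17.0/job", headers, body


if __name__ == "__main__":
    method, path, headers, body = build_create_job("insert", "Account")
    print(method, path)
    for name, value in headers.items():
        print(f"{name}: {value}")
    print()
    print(body.decode("utf-8"))
```

Headers are separated from the body by a blank line, and Content-Length has to match the encoded body exactly, which is why it is computed rather than hard-coded.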

Demo

  Load 100K addresses of medical providers
  Cleanse the data
Bulk API - Some Additional Information

  Can Monitor Bulk Loads in Builder
    – Monitoring -> Bulk Data Load Jobs
  Doesn’t handle attachments
  Governor limits: 500 batches per 24 hours, 10,000
   records per batch, so a theoretical limit of 5M records per
   24 hours
    – Caveat – Batch size also needs to be less than 10MB
  Batch limits can be increased by engineering
    – Need to contact your SE
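
To make the limits above concrete, here is a minimal sketch of splitting CSV rows into batches that respect both the 10,000-record cap and the 10MB size caveat. It assumes "10MB" means 10,000,000 bytes (the conservative reading) and that no single row exceeds the limit on its own. The function and constant names are hypothetical.

```python
MAX_RECORDS = 10_000          # records per batch, from the slide
MAX_BYTES = 10 * 1000 * 1000  # "10MB" read as 10,000,000 bytes (assumption)
MAX_BATCHES_PER_DAY = 500     # batches per 24 hours, from the slide

# 500 batches x 10,000 records = 5,000,000 records per 24 hours.
THEORETICAL_DAILY_RECORDS = MAX_BATCHES_PER_DAY * MAX_RECORDS


def split_into_batches(header, rows, max_records=MAX_RECORDS, max_bytes=MAX_BYTES):
    """Yield CSV strings, each starting with the header and within both limits."""
    header_size = len(header.encode("utf-8")) + 1  # +1 for the newline
    batch, size = [], header_size
    for row in rows:
        row_size = len(row.encode("utf-8")) + 1
        # Flush before adding a row that would break either limit.
        # The batch must stay strictly below max_bytes.
        if batch and (len(batch) >= max_records or size + row_size >= max_bytes):
            yield "\n".join([header] + batch) + "\n"
            batch, size = [], header_size
        batch.append(row)
        size += row_size
    if batch:
        yield "\n".join([header] + batch) + "\n"


if __name__ == "__main__":
    rows = [f"Provider {i},{i} Main St" for i in range(25)]
    for n, csv_batch in enumerate(split_into_batches("Name,Street", rows, max_records=10)):
        print(f"batch {n}: {csv_batch.count(chr(10)) - 1} records")
```

A 100K-record load therefore needs at least 10 batches on the record limit alone, and more if wide rows hit the size limit first.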
Best practices

  Combine Bulk API with Batch Apex to get optimal performance
     – Faster than complex triggers
     – Similar to the demo – this is a generic pattern that you can use in
       many scenarios
  Stick with parallel processing unless there is a reason not to do so
     – See FAQ for scenarios when you would use serial processing
  Handling Very Large Data Volumes requires a comprehensive,
   holistic approach
     – Bulk API is one part of the solution
Resource list

  Bulk API doc
