Tech4Him – Technology with Integrity

A Christian technology chaos wrangler and his thoughts

SSWUG vConf – How to Avoid Common Pitfalls with SSIS

Posted by Tom On April - 24 - 2009

2479409190_446536d89b

Presenter: Matt Masson
http://blogs.msdn.com/mattm/

Starting up a new project with SQL Server Integration Services? Save time and money by learning about how to avoid common issues that users hit when implementing their ETL solutions. Drawing on the most frequently asked questions in the MSDN Integration Services forum and some of the more interesting packages we’ve seen over the years, well highlight common mistakes, clear up some misconceptions, and provide best practices to help you efficiently and effectively use Integration Services.

How Does the Data Flow Work?

  • In-memory ETL engine
    • Really ELT rather than ETL
  • Metadata set at design time, not runtime
    • Allows pre-calculations of sizes, etc…
  • Data is moved through the pipeline in buffers

Chose the Right Transforms

  • Synchronous
    • 1 output row for every input row
  • Asynchronous
    • Any changes to the buffer shape. Like a Union ALL
  • Beware of blocking transforms like Sort and Aggregate
  • Take advantage of your database engine
    • ORDER BY …
    • SELECT INTO …

Don’t do lots of sorts in SSIS, do them in the Source DB input. The DB server is much more efficient at this.

What are Configurations?

  • Allow you to externalize settings
  • Change behavior without modifying the package
  • Multiple configuration sources
    • XML and SQL Server are the most common

Configurations can be added after project creation. Launch the configuration wizard. These are design time configurations.

Using /SET (Used with Runtime configurations)

  • DTEXEC /SET
  • Override values that don’t already have a configuration set
  • Set connection strings
  • Pass arguments to a package

Why Can’t I Share My Package?

  • Package protection level
    • Encrypt {all|sensitive} with User Key
    • Encrypt {all|sensitive} with Password
    • Don’t Save Sensitive
    • Server Storage

How Do I Deploy My Packages?

Deployment – File System

  • Deploy files using an MSI, or copied to a share
  • XML configuration

Deployment – SQL Server (*Recommended*)

  • Files are stored in MSDB
  • Execution typically done with SQL Agent
    • Make use of Proxy Accounts
  • Store configurations in SQL Server

Deployment – Package Store

  • Deploys to the SSIS service
  • Front end to File and SQL storage

What is the SSIS Service For?

  • Not required for execution
    • Tracks running packages
    • Component cache
  • Can manage multiple File or SQL locations
  • Remote connections to the service will require DCOM security changes

How should I do Logging?

  • Logging has to be enabled on the package
  • Multiple logging destinations
    • SQL Server is most common
  • Custom logging solutions

Package Execution Reports

  • Available from Microsoft as a download for Reporting Services (SSRS)

How do I Handle Errors?

  • Control Flow
    • Precedence constraints
    • Event Handlers
    • Events “bubble up” to parent
  • Data Flow
    • Redirect error output

Using Expressions

  • Set package properties
  • Variable values
  • Derived Column transform
  • Conditional Split transform

Data Flow Expressions

  • 32bit vs. 64bit
    • SSIS will install both 64bit and 32bit runtimes
    • BIDS is 32bit only
      • Only 32bit drivers will appear at design time
      • Use project setting for 64bit execution
    • Use configurations to switch providers
  • Other Stuff
    • There are no MS 64bit Excel drivers
      • Use 32bit or 3rdparty
    • SSIS is an extensible platform

Resources


Popularity: 11% [?]

Share and Enjoy:
  • Print
  • Digg
  • del.icio.us
  • Facebook
  • Google Bookmarks
  • email
  • LinkedIn
  • PDF
  • RSS
  • Slashdot
  • StumbleUpon
  • Technorati
  • Twitter

Leave a Reply