Loading QWI Data from the U.S. Census Bureau into Hadoop

Step-by-step instructions on how to download U.S. Census Bureau Quarterly Workforce Indicators data files and load them into a Hadoop cluster.

Tyler Chessman

October 16, 2014


Quarterly Workforce Indicators (QWI) data can be downloaded from the U.S. Census Bureau, as shown in Figure 1.

My example uses files representing a state-level summary of private workforce data by employee sex and age, firm size, and industry group. The direct links for Texas, California, and Nebraska are here:

Note that there are additional, smaller files that describe the various age, firm, and industry group categories. These files could also be downloaded and inserted into Hadoop to represent additional tables. In my example, I simply downloaded these additional files directly into an Excel PowerPivot workbook.
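Before uploading anything, it can be worth confirming that each download is intact and contains the columns you expect. The following is a minimal sketch, assuming the .gz files are gzip-compressed, comma-delimited text; the function name `peek_gzip_csv` and the file name in the usage note are hypothetical and not part of the original walkthrough.

```python
import csv
import gzip
from itertools import islice


def peek_gzip_csv(path, rows=3):
    """Return the header and the first few data rows of a gzipped CSV file."""
    # "rt" reads the compressed stream as text, so nothing is extracted to disk.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # empty list if the file has no rows
        sample = list(islice(reader, rows))
    return header, sample
```

For example, `header, sample = peek_gzip_csv("qwi_tx.csv.gz")` would return the column names and three sample rows for a Texas file saved under that (assumed) name. A truncated or corrupt download will typically raise an error while the rows are being read.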


Once you have the three .gz files downloaded, you need to get them into your Hadoop cluster. For HDInsight, you'll want to upload the files to an Azure Blob container within the storage account associated with the cluster. I used a free tool from CodePlex, the Azure Storage Explorer, to upload the files (see Figure 2). In a production environment, you would likely use the Azure Storage APIs and/or PowerShell.
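As a rough illustration of the scripted route, here is a minimal sketch that uploads local files to a blob container with the `azure-storage-blob` Python package (version 12 or later). That package is newer than this article and is my assumption, not the tool used above; the `qwi` prefix, the helper names, and the connection string and container you pass in are likewise hypothetical.

```python
from pathlib import Path


def blob_name_for(local_path, prefix="qwi"):
    """Build the blob name: a virtual folder prefix plus the local file name."""
    return f"{prefix}/{Path(local_path).name}"


def upload_files(connection_string, container, local_paths, prefix="qwi"):
    """Upload each local file to the container, replacing any existing blob."""
    # Imported here so blob_name_for() can be used without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    for local_path in local_paths:
        blob = service.get_blob_client(
            container=container, blob=blob_name_for(local_path, prefix)
        )
        with open(local_path, "rb") as data:
            # overwrite=True replaces a blob left over from an earlier run.
            blob.upload_blob(data, overwrite=True)
```

Blob storage has no real folders, so the prefix only makes the three files appear grouped together when the container is browsed.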


Using HDP Sandbox

If you are using the HDP Sandbox, you can use the Hadoop command-line interface, or you can upload files by using Hue, an included Web interface for Hadoop. (Note: Hue is not available for an HDP installation on Windows.) Figure 3 shows Hue, accessed from my host machine's browser (the sandbox is running as a guest VM).


Using HDP Installed on Windows

If you've chosen to install HDP on a Windows operating system, you can use the Hadoop command line to load files into a folder. Figure 4 shows the steps needed to create a folder and then upload the three files.
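The create-a-folder-then-upload sequence can also be scripted. The sketch below builds the equivalent `hadoop fs -mkdir` and `hadoop fs -put` commands and runs them in order; it is an illustration under stated assumptions, not a transcription of Figure 4. The `-p` flag assumes a Hadoop 2.x shell, and the function names, the target directory, and the file names in the usage note are hypothetical.

```python
import subprocess


def build_load_commands(local_paths, hdfs_dir, hadoop="hadoop"):
    """Return the hadoop fs commands that create hdfs_dir and copy files into it."""
    # -p creates missing parent folders and does not fail if the folder exists.
    commands = [[hadoop, "fs", "-mkdir", "-p", hdfs_dir]]
    for local_path in local_paths:
        commands.append([hadoop, "fs", "-put", str(local_path), hdfs_dir])
    return commands


def run_commands(commands):
    """Run each command in order, stopping at the first one that fails."""
    for command in commands:
        subprocess.run(command, check=True)
```

A call such as `run_commands(build_load_commands(["qwi_tx.csv.gz"], "/user/demo/qwi"))` would create the folder and copy one file into it. On Windows the launcher is typically `hadoop.cmd`, so you may need to pass `hadoop="hadoop.cmd"` for the process to start.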


Main article: Integrating Hadoop with SQL Server
