As consultants who deal with association data on a daily basis, the Bear Analytics team regularly extolls the power of data and encourages clients to do more with their data. Sounds great, right? But when you’re the one staring down an Excel spreadsheet full of dirty or unformatted data, it’s a different story.
Dealing with data that comes from multiple sources and arrives in different formats can be tricky, time-consuming, and overwhelming. Here are some of our favorite Excel functions we use on a daily basis that can make your data-wrangling easier.
1. Remove Duplicates
This time-saving function lives within Excel’s “data” menu tab. Imagine a scenario where you have been collecting prospect emails via whitepaper downloads on your website and you need to determine the number of leads you have obtained over the past 2 months. Since it’s feasible that some people will have downloaded more than one whitepaper, you know that there are duplicate email addresses in your data set.
Instead of alphabetizing and then eyeballing your list to spot these dupes and then deleting them manually, you can simply highlight the email address row and click “remove duplicates.” This will strip out all repeat instances of an email address, leaving you with the number of unique prospects you received through the whitepaper channel.
While this is an incredibly useful function, it does have its limitations. Removing duplicates will only eliminate EXACT matches. If someone misspells their email address or forgets to write the .com the second time they sign up, this function will still count each instance as unique.
Association data can come from many different systems, each with their own reporting standards and formats. Have you ever tried to combine a list that reports first and last names as separate cells with one that displays names in a single column? This small difference in format makes it impossible to merge the datasets and perform additional analysis in Excel.
You can use the concatenate formula to connect separate cells of data and display the values in a single cell. Create a new column in the spreadsheet and Type =CONCATENATE(B2,” “,A2), where B2 is the cell containing the last name, and A2 is the cell containing the first name. The resulting value will be a new single column of (last name, first name) data with a comma and space in between—identical to the (Smith, John) name formatting in your first list. Now that this name data is in the same format, you will be able to combine these lists and manipulate the data.
Professional Tip: If you are looking to connect first and last names into a single cell, you’ll need to keep the space between the two. In order to do that, you’ll want to include the spacing in the example above (denoted with the “ “ marks).
3. Count If
Let’s say you are looking through your list of current members and want to know the total number that hail from Ohio. Sure, you could scroll through all 4,000 members and manually count the number of people with “Ohio” listed in their state address field. But, thankfully, there is a better way!
Using the Count If formula enables you to quickly count all of the entries who have one variable—in this case State—in common. Type =CountIf into a new cell, and then tell the formula where to look and what to look for. Since we are looking for “Ohio” in the state field (let’s say it is in column H of your spreadsheet for example’s sake), the full formula would be =CountIf(H:H, “Ohio”). This powerful formula will count the number of members with Ohio listed as their state in a matter of seconds and will make your life much easier.
With these Excel functions in hand, you can save time formatting and cleaning your data and dive into deeper analysis—the fun stuff!