It will often be necessary to take unrefined data, or data provided in a format incompatible with the analytical software you are using, and format it properly for use. If you are working with just a small number of records, you might do this by hand. But if you are dealing with large numbers of records, you want to do this as a batch formatting.
Batch Merging Cells
1. Imagine you have four columns of data consisting of rough locational data, arranged as follows:
But for your needs, you would like to have all four column entries in one single entry, separated by spaces and commas, and in the order of number, street, city, state. To do this, create a new column (e.g. "Location") and enter the following equation in the second row (the first row after the column headings):
=C2&" "&B2&", "&A2&", "&D2
The quotation marks and the characters enclosed by them (here, spaces and commas) will show up in the new cell entry between the values in cells C2, B2, A2, and D2. Note that there is a space keyed in between the quotation marks in the first instance, and a comma followed by a space in the next two. The result will be therefore be the entry in cell C2 followed by a space, then the entry in B2 followed by a comma and space, and so on:
2. To complete the merge, or concatenation, for every row, create a box around the entries in the location column starting with the first entry (not the column header row) and drag down to the last entry. Hit Ctrl + D to fill in the same formula for the entire column (Excel automatically takes into account the advancing row numbers, and will insert C3, B3, etc.):
Batch Splitting Entries in Cells
1. Excel also has a pretty useful way to split up cell entries so that you have each piece of information in a separate column. This can be extremely helpful in cases where the original data was entered so that discrete values, which could easily serve as independently analyzed variables, were thrown together in one column. Consider the following data in rough, unrefined form:
Among the problems from the viewpoint of having standardized data presented here are the way wages are given in three different formats, and that the "occupants" are sometimes listed with more than one entry per cell. Let's say that you want the occupants to be listed separately so that you can sort them separately as well as make easy counts of the number of men and women. To do this, you need to split the occupants column and re-enter them in separate fields. Select the column that you wish to split by clicking on its alphabetic letter at the top of the sheet. With the column now highlighted, select from the top main menu Data > Text to Columns. The following conversion wizard option will appear:
2. Excel will detect what it believes has been used to separate text in your column, almost always spaces, commas, or a combination of both. In this case text is separated within cells by spaces. Select "Delimited" and hit Next.
3. On the next menu, you have the option to confirm that it is spaces that are separating the names in your column. Check the space box and uncheck the tab box. Below, you should see Excel nicely line up each piece of text in your cells separate in a preview column:
4. Continuing to the last step, Excel gives you the option to assign a format to each newly created column (General, Text, Date) or an option to omit importing the column. It is sufficient to leave the column format as general, since exporting these Excel tables as CSV files entails losing some of that internal Excel organization of data anyway. Here, it is a good idea to go ahead and omit now-superfluous entries like the "&", "and", "And" conjunctions originally entered.
5. By default, in the "Destination" option, Excel will put the first of the split cells into the column you've selected, and paste remaining new split cells onto subsequent columns, even if you have other data there. To avoid this, select the destination column by clicking on the selection box in destination and creating a box around the empty column destinations. For this example, four new columns are needed (one for the first and last names of two different individuals for a total of four):
6. Once you have the destination columns selected, X-close out the ribbon menu bar to return to the wizard. Click Finish. You will now have a new set of columns with split entries. The columns can then be renamed to reflect the entries in each column: