In this post, I explain how to create a PowerShell function to process CSV data. This allows you to reuse your code whenever you are working with CSV files in PowerShell

CSV (Comma-Separated Values) is used by almost every technology platform that we encounter. Manipulating this data can be cumbersome if you're NOT an Excel wizard, but PowerShell can simplify this job. For example, let's take a CSV with the following data:

We have been asked to modify this data and reorder where the data resides in the new CSV. We have been given an example of what the output should look like:

I always encourage people to create a function if they believe they will reuse it in the future, and I'm guessing we will be asked to provide this information again. First, I open the PowerShell ISE and then use the Crtl+J shortcut to pull up my Snippets. Next, I select Cmdlet (Advanced Function), which creates the outline for my new PowerShell function.

Next, we change the name of our function and add some parameters. Remember, you should always reference Get-Verb when deciding on a new function name.

<# .Synopsis Short description .DESCRIPTION Long description .EXAMPLE Example of how to use this cmdlet .EXAMPLE Another example of how to use this cmdlet #> function Convert-HRCSVData { [CmdletBinding()] [Alias()] Param ( # Provide a input csv path [Parameter(Mandatory=$false, ValueFromPipelineByPropertyName=$true, Position=0, ParameterSetName='ParamSet1')] [ValidateNotNull()] [ValidateNotNullOrEmpty()] [Alias("CSV")] $InFile= "C:\scripts\input\HR_input.csv", # Provide a output csv path [Parameter(Mandatory=$false, ValueFromPipelineByPropertyName=$true, Position=1, ParameterSetName='ParamSet1')] [ValidateNotNull()] [ValidateNotNullOrEmpty()] $OutFile = "c:\scripts\output\HR_output_$(Get-Date -f dd-MMM-yyyy_HHmmss).csv" ) Begin { } Process { } End { } } 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 < # . Synopsis Short description . DESCRIPTION Long description . EXAMPLE Example of how to use this cmdlet . EXAMPLE Another example of how to use this cmdlet #> function Convert - HRCSVData { [ CmdletBinding ( ) ] [ Alias ( ) ] Param ( # Provide a input csv path [ Parameter ( Mandatory = $ false , ValueFromPipelineByPropertyName = $ true , Position = 0 , ParameterSetName = 'ParamSet1' ) ] [ ValidateNotNull ( ) ] [ ValidateNotNullOrEmpty ( ) ] [ Alias ( "CSV" ) ] $ InFile = "C:\scripts\input\HR_input.csv" , # Provide a output csv path [ Parameter ( Mandatory = $ false , ValueFromPipelineByPropertyName = $ true , Position = 1 , ParameterSetName = 'ParamSet1' ) ] [ ValidateNotNull ( ) ] [ ValidateNotNullOrEmpty ( ) ] $ OutFile = "c:\scripts\output\HR_output_$(Get-Date -f dd-MMM-yyyy_HHmmss).csv" ) Begin { } Process { } End { } }

The first thing we should do is create an object to hold any changes that we make to our CSV data. This will allow us to manipulate all the data, and then we will be able to export it to a new CSV. I like to add this to the Begin block, even though in this case it is not needed since we are only processing one CSV at a time. The Begin block should look like this:

Begin { $ReturnObject = @() } Then, we should add functionality to import the CSV. In addition, we know that we will need to loop through each line in the CSV, so we should add this capability now. Begin { $ReturnObject = @() $DataIn = Import-Csv -Path $InFile } Process { foreach ($d in $DataIn) { } } 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Begin { $ ReturnObject = @ ( ) } Then , we should add functionality to import the CSV . In addition , we know that we will need to loop through each line in the CSV , so we should add this capability now . Begin { $ ReturnObject = @ ( ) $ DataIn = Import - Csv - Path $ InFile } Process { foreach ( $ d in $ DataIn ) { } }

Now, we have the general structure of our function. Next, we need to figure out how to modify the existing fields and add new ones. Luckily, Export-CSV has some really great examples that seem to fit our needs. Based on those examples, we can use a PSCustomObject to create our CSV headers and their corresponding values.

Since we know that our CSV headers should be "ID,""DEPARTMENT," "COMPLETED_DATE," and "NAMES," we use these as the property names for our object. We also expect more than one entry in our CSV, so we create a temporary PSCustomOjbect. For every iteration of the loop, we add the newly created temporary object to our $ReturnObject. To do this, we add the following to our foreach loop:

Process { foreach ($d in $DataIn) { $props = [ordered]@{ ID = DEPARTMENT = COMPLETED_DATE = NAMES = } $tempObj = New-Object -TypeName PSCustomObject -Property $props $ReturnObject += $tempObj } } 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Process { foreach ( $ d in $ DataIn ) { $ props = [ ordered ] @ { ID = DEPARTMENT = COMPLETED_DATE = NAMES = } $ tempObj = New - Object - TypeName PSCustomObject - Property $ props $ ReturnObject += $ tempObj } }

We're almost there, but first we need to access the values of the imported CSV. To do this, we reference the header of the CSV that was imported and add it to the property of our temporary custom object. You may have noticed that we are creating an [ordered] object. This is because it is required that our new CSV have values in a specified order (see above). This is how we access the values and set them in our temporary object:

Process { foreach ($d in $DataIn) { $HRId = $d.'HR_OUTPUT_ID' # Borrowed from https://blogs.technet.microsoft.com/heyscriptingguy/2014/08/27/use-powershell-to-normalize-names-before-data-import/ $name = (Get-Culture).TextInfo.ToTitleCase($d.'HR_Names').Split(',').trim() $props = [ordered]@{ ID = "$HRId".PadLeft(8, '0') DEPARTMENT = ($d.'Department and Office Number').substring(0,4).ToUpper() COMPLETED_DATE = [DateTime]::Parse($d.'Sample Test Ended/Completed On Date').ToString('MM/dd/yyyy') NAMES = "$($name[1]) $($name[0])" } $tempObj = New-Object -TypeName PSCustomObject -Property $props $ReturnObject += $tempObj } } 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Process { foreach ( $ d in $ DataIn ) { $ HRId = $ d . 'HR_OUTPUT_ID' # Borrowed from https://blogs.technet.microsoft.com/heyscriptingguy/2014/08/27/use-powershell-to-normalize-names-before-data-import/ $ name = ( Get - Culture ) . TextInfo . ToTitleCase ( $ d . 'HR_Names' ) . Split ( ',' ) . trim ( ) $ props = [ ordered ] @ { ID = "$HRId" . PadLeft ( 8 , '0' ) DEPARTMENT = ( $ d . 'Department and Office Number' ) . substring ( 0 , 4 ) . ToUpper ( ) COMPLETED_DATE = [ DateTime ] :: Parse ( $ d . 'Sample Test Ended/Completed On Date' ) . ToString ( 'MM/dd/yyyy' ) NAMES = "$($name[1]) $($name[0])" } $ tempObj = New - Object - TypeName PSCustomObject - Property $ props $ ReturnObject += $ tempObj } }

To explain what we are doing, I'll start from the inner-most line of the foreach loop. First, we save the value in HD_OUTPUT_ID to a variable so that we can pad its value when we create our temporary object.

Next come lines borrowed from Hey Scripting Guy, which allow us to ensure that we are capitalizing correctly and that we are splitting at the ' ,'. When a string is split using the Split() method, it creates an array of values based on where it was split. To access these values, we need to access them directly with $name[0] or $name[1]. We know that in our input CSV the first value was the surname and the second value was the given name. To switch the order, we simple create a new string with the second value first and the first value second.

Then, we create a $props ordered hashtable, after which, we set the ID field in our CSV with the HRId value, but we pad the value on the left-hand side with zeros if its value is less than 8. Moving to the DEPARTMENT property, we access the current field in our input CSV, called "Department and Office Number," by accessing its property name (which is the name of the field in the CSV). The output CSV example told us that we need the first 4 letters, so we call substring() and ToUpper() methods of the string value.

Moving on to the COMPLETED_DATE property, we create a [DateTime] object and pass in the value in the existing CSV Value ($d.'Sample Test Ended/Completed on Date') and use the ToString() method. Within the ToString() method, we pass the format we want the output CSV to have and set that to the COMPLETED_DATE property (key).

Once we have our new ordered list, we create a temporary PSCustomObject and save this to our $ReturnObject. Once we have exited the foreach loop, we simply export our $ReturnObject using the Export-CSV cmdlet:

End { $returnobject | Export-Csv -Path $OutFile -Force -NoTypeInformation } 1 2 3 4 End { $ returnobject | Export - Csv - Path $ OutFile - Force - NoTypeInformation }

Here is the full function for our sample problem:

function Convert-HRCSVData { [CmdletBinding(DefaultParameterSetName='Parameter Set 1', PositionalBinding=$false)] [Alias()] [OutputType()] Param ( # Provide a input csv path [Parameter(Mandatory=$false, ValueFromPipelineByPropertyName=$true, Position=0, ParameterSetName='ParamSet1')] [ValidateNotNull()] [ValidateNotNullOrEmpty()] [Alias("CSV")] $InFile= "C:\scripts\input\HR_input.csv", # Provide a output csv path [Parameter(Mandatory=$false, ValueFromPipelineByPropertyName=$true, Position=1, ParameterSetName='ParamSet1')] [ValidateNotNull()] [ValidateNotNullOrEmpty()] $OutFile = "c:\scripts\output\HR_output_$(Get-Date -f dd-MMM-yyyy_HHmmss).csv" ) Begin { $ReturnObject = @() $DataIn = Import-Csv $InFile } Process { foreach ($d in $DataIn) { $HRId = $d.'HR_OUTPUT_ID' $name = (Get-Culture).TextInfo.ToTitleCase($d.'HR_Names').Split(',').trim() $props = [ordered]@{ ID = "$HRId".PadLeft(8, '0') DEPARTMENT = ($d.'Department and Office Number').substring(0,4).ToUpper() COMPLETED_DATE = [DateTime]::Parse($d.'Sample Test Ended/Completed On Date').ToString('MM/dd/yyyy') NAMES = "$($name[1]) $($name[0])" } $tempObj = New-Object -TypeName PSCustomObject -Property $props $ReturnObject += $tempObj } } End { $returnobject | Export-Csv -Path $OutFile -Force -NoTypeInformation } } 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 function Convert - HRCSVData { [ CmdletBinding ( DefaultParameterSetName = 'Parameter Set 1' , PositionalBinding = $ false ) ] [ Alias ( ) ] [ OutputType ( ) ] Param ( # Provide a input csv path [ Parameter ( Mandatory = $ false , ValueFromPipelineByPropertyName = $ true , Position = 0 , ParameterSetName = 'ParamSet1' ) ] [ ValidateNotNull ( ) ] [ ValidateNotNullOrEmpty ( ) ] [ Alias ( "CSV" ) ] $ InFile = "C:\scripts\input\HR_input.csv" , # Provide a output csv path [ Parameter ( Mandatory = $ false , ValueFromPipelineByPropertyName = $ true , Position = 1 , ParameterSetName = 'ParamSet1' ) ] [ ValidateNotNull ( ) ] [ ValidateNotNullOrEmpty ( ) ] $ OutFile = "c:\scripts\output\HR_output_$(Get-Date -f dd-MMM-yyyy_HHmmss).csv" ) Begin { $ ReturnObject = @ ( ) $ DataIn = Import - Csv $ InFile } Process { foreach ( $ d in $ DataIn ) { $ HRId = $ d . 'HR_OUTPUT_ID' $ name = ( Get - Culture ) . TextInfo . ToTitleCase ( $ d . 'HR_Names' ) . Split ( ',' ) . trim ( ) $ props = [ ordered ] @ { ID = "$HRId" . PadLeft ( 8 , '0' ) DEPARTMENT = ( $ d . 'Department and Office Number' ) . substring ( 0 , 4 ) . ToUpper ( ) COMPLETED_DATE = [ DateTime ] :: Parse ( $ d . 'Sample Test Ended/Completed On Date' ) . ToString ( 'MM/dd/yyyy' ) NAMES = "$($name[1]) $($name[0])" } $ tempObj = New - Object - TypeName PSCustomObject - Property $ props $ ReturnObject += $ tempObj } } End { $ returnobject | Export - Csv - Path $ OutFile - Force - NoTypeInformation } }

Join the 4sysops PowerShell group!

Your question was not answered? Ask in the forum!

1+







