Tag Archives: DevOps

Making PSScriptAnalyzer a first-class citizen in a PowerShell CI pipeline

As you already know if you have read this or this, I’m a big fan of PSScriptAnalyzer to maintain a certain standard of coding style and quality. Where this is especially powerful is inside a continuous integration pipeline because this allows us to enforce that coding standard.

In our CI pipeline, we can easily make the build fail if our code violates one or more PSScriptAnalyzer rule(s). That’s great, but the main point of continuous integration is to give quick feedback to developers about their code change(s). Continuous integration is about catching problems early to fix them early. So, Green/Red or Pass/Fail is OK, but providing meaningful information about a problem to help remediate it is better. And pretty darn important.

So now, the question is :

How can we make our CI tool publish PSScriptAnalyzer results with the information we need to remediate any violation ?

All CI tools have ways to publish test results to make them highly visible, to drill down into a test failure, and even do some reporting on these test results. Since we are talking about a PowerShell pipeline, we are most likely already using Pester to test our PowerShell code. Pester can spit out results in the same XML format as NUnit and these NUnit XML files can be consumed and published by most CI tools.

It makes a lot of sense to leverage this Pester integration as a universal CI glue and run our PSScriptAnalyzer checks as Pester tests. Let’s look at possible ways to do that.

One Pester test checking if the PSScriptAnalyzer result is null :

This is probably the simplest way to invoke PSScriptAnalyzer from Pester :

Describe 'PSScriptAnalyzer analysis' {
    
    $ScriptAnalyzerResults = Invoke-ScriptAnalyzer -Path ".\ExampleScript.ps1" -Severity Warning
    
    It 'Should not return any violation' {
        $ScriptAnalyzerResults | Should BeNullOrEmpty
    }
}
  

 
Here, we are checking all the rules which have a “Warning” severity within one single test. Then, we rely on the fact that if PSScriptAnalyzer returns something, it means that there was at least one violation, and if PSScriptAnalyzer returns nothing, it’s all good.

There are 2 problems here :

  • We are evaluating a whole bunch of rules in a single test, so the test name cannot tell us which rule was violated
  • As soon as there is more than one violation, the Pester message gives us useless information

How useless ? Well, let’s see :

useless-pester-stacktrace
 
The Pester failure message gives us the object type of the PSScriptAnalyzer results, instead of their contents. This does not provide what we need to locate and remediate the problem, like the name of the file which violated the rule and the line number in that file where the violation is located.

One Pester test per PSScriptAnalyzer rule :

This is a pretty typical (and better) way of running PSScriptAnalyzer checks via Pester.

Describe 'PSScriptAnalyzer analysis' {
    
    $ScriptAnalyzerRules = Get-ScriptAnalyzerRule -Name "PSAvoid*"

    Foreach ( $Rule in $ScriptAnalyzerRules ) {

        It "Should not return any violation for the rule : $($Rule.RuleName)" {
            Invoke-ScriptAnalyzer -Path ".\ExampleScript.ps1" -IncludeRule $Rule.RuleName |
            Should BeNullOrEmpty
        }
    }
}
  

 
In this case, the first step is to get a list of the rules that we want to evaluate. Here, I changed the list of rules to : all rules which have a name starting with “PSAvoid”.
This is just to show that we can filter the rules by name, as well as by severity.

Then, we loop through this list of rules and have a Pester test evaluating each rule, one by one. As we can see below, the output is much more useful :

psscriptanalyzer-by-rule
 
This is definitely better but we still encounter the same issue as before, because there was more than one violation of that “PSAvoidUsingWMICmdlet” rule. So we still don’t get the file name and the line number.

We could use a nested loop : for each rule, we would loop through each file and evaluate that rule against each file one-by-one. That would be more granular and reduce the risk of this particular issue. But if a single file violated the same rule more than once, we would still have the same problem.
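For illustration, here is a rough sketch of what that nested-loop variant could look like (the path and the rule filter are only examples) :

Describe 'PSScriptAnalyzer analysis' {

    $ScriptAnalyzerRules = Get-ScriptAnalyzerRule -Name "PSAvoid*"
    $ScriptFiles = Get-ChildItem -Path ".\" -Filter "*.ps1" -Recurse

    Foreach ( $Rule in $ScriptAnalyzerRules ) {
        Foreach ( $File in $ScriptFiles ) {

            It "Should not violate the rule $($Rule.RuleName) in file $($File.Name)" {
                Invoke-ScriptAnalyzer -Path $File.FullName -IncludeRule $Rule.RuleName |
                Should BeNullOrEmpty
            }
        }
    }
}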

So, I decided to take a different direction to address this problem : taking the output from PSScriptAnalyzer and converting it to a test result file, using the same XML schema as Pester and NUnit.

Converting PSScriptAnalyzer output to a test result file :

For that purpose, I wrote a function named Export-NUnitXml, which is available on GitHub in this module.

Here are the high-level steps of what Export-NUnitXml does :

  • Take the output of PSScriptAnalyzer as its input (zero or more objects of the type [Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord])
  • Create an XML document containing a “test-case” node for each input object.
  • Write this XML document to the file specified via the Path parameter.

Here is an example of how we can use this within a build script (in Appveyor.com as the CI tool, in this case) :

$ScriptAnalyzerRules = Get-ScriptAnalyzerRule -Severity Warning
$ScriptAnalyzerResult = Invoke-ScriptAnalyzer -Path ".\CustomPSScriptAnalyzerRules\ExampleScript.ps1" -IncludeRule $ScriptAnalyzerRules
If ( $ScriptAnalyzerResult ) {
  
    $ScriptAnalyzerResultString = $ScriptAnalyzerResult | Out-String
    Write-Warning $ScriptAnalyzerResultString
}
Import-Module ".\Export-NUnitXml\Export-NUnitXml.psm1" -Force
Export-NUnitXml -ScriptAnalyzerResult $ScriptAnalyzerResult -Path ".\ScriptAnalyzerResult.xml"

(New-Object 'System.Net.WebClient').UploadFile("https://ci.appveyor.com/api/testresults/nunit/$($env:APPVEYOR_JOB_ID)", (Resolve-Path .\ScriptAnalyzerResult.xml))

If ( $ScriptAnalyzerResult ) {        
    # Failing the build
    Throw "Build failed because there was one or more PSScriptAnalyzer violation. See test results for more information."
}
   

 
And here is the result in Appveyor :

appveyor-overview
 
Just by reading the name of the test case, we get the essential information : the rule name, the file name and even the line number. Pretty nice, huh ?

Also, we can expand any failed test (by clicking on it) to get additional information. For example, the last 2 tests are expanded below :

appveyor-test-details
 
The “Stacktrace” section provides additional details, like the rule severity and the actual offending code. Another nice touch is that the “Message” section gives us the rule message, which normally provides an actionable recommendation to remediate the problem.

But, what if PSScriptAnalyzer returns nothing ?
Export-NUnitXml does handle this scenario gracefully because its ScriptAnalyzerResult parameter accepts $Null.
In this case, the test result file will contain only one test case and this test passes.
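For reference, here is a minimal sketch (not the actual Export-NUnitXml code) of how a mandatory parameter can be made to accept $Null, using the AllowNull parameter attribute :

Function Export-NUnitXmlSketch {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory=$True)]
        [AllowNull()]
        [PSObject[]]$ScriptAnalyzerResult,

        [Parameter(Mandatory=$True)]
        [string]$Path
    )
    # When $ScriptAnalyzerResult is $Null, the function can simply emit a single, passing "test-case" node
}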

Let’s test this :

Import-Module -Name 'PsScriptAnalyzer' -Force
Import-Module ".\Export-NUnitXml\Export-NUnitXml.psm1" -Force
Export-NUnitXml -ScriptAnalyzerResult $Null -Path ".\ScriptAnalyzerResult.xml"

(New-Object 'System.Net.WebClient').UploadFile("https://ci.appveyor.com/api/testresults/nunit/$($env:APPVEYOR_JOB_ID)", (Resolve-Path .\ScriptAnalyzerResult.xml))
  

 
Here is what it looks like in Appveyor:

appveyor-passed-psscriptanalyzer-tests
 
There’s nothing more beautiful than a green test…

So now, as developers, we not only have quick feedback on our adherence to coding standards, but we also get actionable guidance on how to improve.
And remember, this NUnit XML format is widely supported in the CI/CD world, so even though I only showed Appveyor, this would work similarly in TeamCity, Microsoft VSTS, and others…

Where to start your automation efforts ? An analogy for IT infrastructure folks

Nowadays, many IT shops look a lot like the first few pages of The Phoenix Project : a lack of automation and communication leads to manual and unpredictable deployments, which in turn leads to lots of firefighting. Fortunately, many of these shops are starting to understand the value of implementing some of the DevOps principles.

Of all the DevOps principles, automation is probably the easiest to grasp so that’s where most IT organisations start. I’m not saying this is the best starting point, but this is the most typical starting point of a DevOps journey. At this point, IT folks who are traditionally infrastructure-focused are asked to move their focus “up-the-stack” and towards automation. This is a difficult shift to make, especially for those who are not willing to adapt and learn new skills. But even for those who are willing to make that shift, it can be confusing.

OK, I’m going to automate stuff, but where do I start ? What should I automate first ?


I’ll attempt to answer that question by drawing on my 8 years of hard-learned lessons in technical support, using a metaphor that most infrastructure-focused IT folks can relate to.

Storage performance troubleshooting :

Users are complaining that a specific application is “very slow”, so you go and check the resource usage for the virtual machine running that particular application. You notice frequent storage latency spikes of up to 3 seconds ! Now, you need to identify where this latency is coming from. This is easier said than done : storage infrastructure is complex, especially SAN-based storage. The latency can come from many different areas : from within the virtual machine (or the application itself), from the hypervisor, from the SAN fabric (cables and switches) or from the storage array.

But ultimately, whatever the transport protocol, it boils down to encapsulated SCSI commands. So you need to consider the storage system as a whole and try to picture it as a pipeline transporting SCSI commands from a VM to a disk. Any component which is struggling to keep up will eventually slow down the entire pipeline. Why ?

Because the SCSI commands, or frames, or packets, or whatever the unit of work is at this stage, are going to be stored in the queue (or buffer) of the struggling component. And what happens when the struggling component’s queue is full ? The queue of another component, located upstream in the pipeline, starts to fill up. In the meantime, any component located downstream in the pipeline is left with no work to do, waiting, waiting, waiting…

So you need to check the latency metrics at each component in the pipeline and single out which one introduces the most latency : the big bad bottleneck. It’s not always obvious and sometimes the victim can be confused with the culprit, but that’s where your expertise comes in. That’s why you are paid the big bucks, right ?

Let’s say you have identified the latency introduced in each area, like so :

  • Virtual machine : 10 milliseconds
  • Hypervisor : 25 milliseconds
  • SAN fabric : 50 milliseconds
  • Storage array : variable, up to almost 3 seconds

Now, if you reduce the latency at the storage array level by 2 seconds, how much latency improvement are you going to see on the whole storage system end-to-end ?
2 seconds.
If you reduce the latency at the SAN fabric level by 30 milliseconds, how much latency improvement are you going to see on the whole storage system end-to-end ?
None.
This seems quite obvious, but it emphasizes a very important point : the energy, time and resources spent on anything other than the bottleneck are 100% waste. So where do you start ? You identify the bottleneck and focus all your efforts on improving the performance of that component.

After further investigation, it turns out there were 2 dead disks in the RAID set where the virtual machine was stored. You replace the 2 disks and now the latency at the storage array level is only 30 milliseconds. That’s great, you have eliminated the first bottleneck, but now you have a new one : the SAN fabric. Yep, in any pipeline, at any given time, there is one (and only one) bottleneck. So now, you can focus all your efforts on the fabric, unless the current performance is acceptable for your application(s).

This is in a nutshell how to prioritize performance improvements in a storage system, but what the hell does it have to do with automation ?
Everything.

IT service delivery as a pipeline :

pipeline
 
Lean thinking (and all the stuff that came from it, like Agile) originated from applying the principles of Lean manufacturing (pioneered by Toyota) to the software industry. It compares a software delivery pipeline with a manufacturing assembly line. This is a very powerful analogy and many fascinating lessons have been drawn from it. If you are not convinced, I suggest you read this : Lean Software Development: An Agile Toolkit.

With the emergence of DevOps and “Lean IT”, some people started to realize that it doesn’t necessarily have to be software : these principles can be applied to IT operations as well. IT service delivery can be considered a pipeline, pretty much like a storage system. It transports a continuous flow of tickets, feature requests, change requests, or work items instead of SCSI commands, frames or packets. The destination of the services it delivers is one or more end-users, instead of a disk.

The components involved in the different stages along the pipeline vary a lot depending on the type and scope of the request : application team, systems team, maybe network and storage teams, security team, possibly some kind of QA, a manager and/or procurement approval depending on the cost, etc… It can potentially be very complex, but you need to gather an exhaustive list of all the parts involved in the pipeline, even outside of IT. This requires taking a step back from your IT bubble and looking at the bigger picture, which is not easy, but it is important to have a holistic view of “the system”.

This mental model of a pipeline allows you to “troubleshoot” the performance of your IT service delivery. Just like any pipeline, the 2 main ways to measure performance are :

  • Throughput (the amount of work that can go through the pipeline in a given amount of time)
  • Latency (the time it takes for a single unit of work to go from the beginning of the pipeline to the end)

Similarly to most storage admins, Agile and DevOps tend to focus more on latency as a performance indicator. Agile talks a lot about “velocity” and DevOps uses the term “lead times“. So again, you need to determine the latency at each stage of the IT service delivery pipeline and identify the component which adds the most latency : the big bad bottleneck.

Like in a storage infrastructure, bottlenecks tend to manifest themselves as a queue filling up with work in progress. But this work waiting to be processed is often not obvious in the context of IT services, because you cannot touch it (except for hardware). You can easily see car parts piling up in a warehouse, but can you see the 1s and 0s piling up ? A nice way to make work more visible is to use tools like Kanban boards.

When the bottleneck is identified, you know where to focus your efforts. Not just your automation efforts, ALL efforts should be focused on addressing the bottleneck. Yes, automation may very well be only a part of the solution and in some cases, it might even be a very small portion of the solution. This is because technology may only be a small part of the problem. The bigger problem may very well be lack of communication, or lack of skills, or information retention, or overly complex processes, etc… This means your automation efforts need to be integrated into a broader effort to eliminate the bottleneck. It is about IT and “the business” working together towards a common goal.

When the latency of the struggling component in the IT service delivery pipeline has been reduced enough to not be the bottleneck anymore (using automation and any other means), you know that you have improved the performance of your entire IT organization, and by how much. By the way, this information can come in handy for your next pay review. After that, you can move on and focus all your energy (and newly acquired automation skills) on the next bottleneck.

So there is no ready-made recipe to tell you where to start your automation efforts. It is not easy because every business is different and because it requires a shift in mindset, thinking about the bigger picture. But hopefully, this article provides some guidance on how to prioritize your efforts. It is very much about optimizing the impact of your work by focusing on relieving the most crippling pain your IT organization has, and by doing so, making yourself more valuable.

If you want more on these topics, here is a pretty deep perspective on automation, infrastructure (networking in this case) and bottlenecks.

A Boilerplate for Unit testing DSC resources with Pester

Unit testing PowerShell code is slowly but surely becoming mainstream. Pester, the awesome PowerShell testing framework is playing a big part in that trend.
But why the hell would you write more PowerShell code to test your PowerShell code ? Because :

  • It can give you a better understanding of your code, its design, its assumptions and its behaviour.
     
  • When you make changes and the unit tests pass, you can be pretty confident that you didn’t break anything.
    This makes changes less painful and scary and this is a very important notion in DevOps : removing fear and friction to make changes painless, easy, fast and even … boring.
     
  • It helps write more robust, less buggy code.
     
  • Given the direction the PowerShell community is taking and the way the DevOps movement is permeating the IT industry, this is becoming a valuable skill.
     
  • There is an initial learning curve and it takes time, effort and discipline, but if you do it often enough, it can quickly become second nature.
     

To help reduce this time and effort, I wanted to build a Pester script template which could be reused for unit testing any DSC resource. After all, DSC resources have a number of specific requirements and best practices, for example : Get-TargetResource should return a hashtable, or Test-TargetResource should return a boolean… So we can write tests for all these requirements and these tests can be readily reused for any other DSC resource (non class-based).

Without further ado, here is the full script (which is also available on GitHub) and then we’ll elaborate on the main bits and pieces :

$Global:DSCResourceName = 'My_DSCResource'  #<----- Just change this

Import-Module "$($PSScriptRoot)\..\..\DSCResources\$($Global:DSCResourceName)\$($Global:DSCResourceName).psm1" -Force

# Helper function to list the names of mandatory parameters of *-TargetResource functions
Function Get-MandatoryParameter {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory=$True)]
        [string]$CommandName
    )
    $GetCommandData = Get-Command "$($Global:DSCResourceName)\$CommandName"
    $MandatoryParameters = $GetCommandData.Parameters.Values | Where-Object { $_.Attributes.Mandatory -eq $True }
    return $MandatoryParameters.Name
}

# Getting the names of mandatory parameters for each *-TargetResource function
$GetMandatoryParameter = Get-MandatoryParameter -CommandName "Get-TargetResource"
$TestMandatoryParameter = Get-MandatoryParameter -CommandName "Test-TargetResource"
$SetMandatoryParameter = Get-MandatoryParameter -CommandName "Set-TargetResource"

# Splatting parameters values for Get, Test and Set-TargetResource functions
$GetParams = @{
    
}
$TestParams = @{
    
}
$SetParams = @{
    
}

Describe "$($Global:DSCResourceName)\Get-TargetResource" {
    
    $GetReturn = & "$($Global:DSCResourceName)\Get-TargetResource" @GetParams

    It "Should return a hashtable" {
        $GetReturn | Should BeOfType System.Collections.Hashtable
    }
    Foreach ($MandatoryParameter in $GetMandatoryParameter) {
        
        It "Should return a hashtable with key named $MandatoryParameter" {
            $GetReturn.ContainsKey($MandatoryParameter) | Should Be $True
        }
    }
}

Describe "$($Global:DSCResourceName)\Test-TargetResource" {
    
    $TestReturn = & "$($Global:DSCResourceName)\Test-TargetResource" @TestParams

    It "Should have the same mandatory parameters as Get-TargetResource" {
        # Does not check for $True or $False but uses the output of Compare-Object.
        # That way, if this test fails Pester will show us the actual difference(s).
        (Compare-Object $GetMandatoryParameter $TestMandatoryParameter).InputObject | Should Be $Null
    }
    It "Should return a boolean" {
        $TestReturn | Should BeOfType System.Boolean
    }
}

Describe "$($Global:DSCResourceName)\Set-TargetResource" {
    
    $SetReturn = & "$($Global:DSCResourceName)\Set-TargetResource" @SetParams

    It "Should have the same mandatory parameters as Test-TargetResource" {
        (Compare-Object $TestMandatoryParameter $SetMandatoryParameter).InputObject | Should Be $Null
    }
    It "Should not return anything" {
        $SetReturn | Should Be $Null
    }
}

 
That’s a lot of information so let’s break it down into more digestible chunks :

$Global:DSCResourceName = 'My_DSCResource'  #<----- Just change this

 
The “My_DSCResource” string is the only part of the entire script which needs to be changed from one DSC resource to another. All the rest can be reused for any DSC resource.

Import-Module "$($PSScriptRoot)\..\..\DSCResources\$($Global:DSCResourceName)\$($Global:DSCResourceName).psm1" -Force

The relative path to the module containing the DSC resource is derived from a standard folder structure, with a “Tests” folder at the root of the module and a “Unit” subfolder, containing the resulting unit tests script, for example :

O:\> tree /F "C:\Git\FolderPath\DscModules\DnsRegistration"
Folder PATH listing for volume OS

│   DnsRegistration.psd1
│
├───DSCResources
│   └───DnsRegistration
│       │   DnsRegistration.psm1
│       │   DnsRegistration.schema.mof
│       │
│       └───ResourceDesignerScripts
│               GenerateDnsRegistrationSchema.ps1
│
└───Tests
    └───Unit
            DnsRegistration.Tests.ps1

 
We load the module because we’ll need to use the 3 functions it contains : Get-TargetResource, Set-TargetResource and Test-TargetResource.

By the way, note that this script is divided into 3 Describe blocks : this is a more or less established convention in unit testing with Pester : one Describe block per tested function. The “Force” parameter of Import-Module is to make sure that, even if the module was already loaded, we get the latest version of the module.

Function Get-MandatoryParameter {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory=$True)]
        [string]$CommandName
    )
    $GetCommandData = Get-Command "$($Global:DSCResourceName)\$CommandName"
    $MandatoryParameters = $GetCommandData.Parameters.Values | Where-Object { $_.Attributes.Mandatory -eq $True }
    return $MandatoryParameters.Name
}

 
This is a helper function used to get the mandatory parameter names of the *-TargetResource functions. If you use more than a few helper functions in your unit tests, you should probably gather them in a separate script or module.

# Splatting parameters values for Get, Test and Set-TargetResource functions
$GetParams = @{
     
}
$TestParams = @{
     
}
$SetParams = @{
     
}

 
These are placeholders to be completed with the parameters and values for Get-TargetResource, Test-TargetResource and Set-TargetResource, respectively. Splatting makes them more readable, especially for resources that have many parameters. We might use the same parameters and parameter values for all 3 functions ; in that case, we can consolidate these 3 hashtables into a single one.
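For example, for a hypothetical resource taking a ZoneName and an Ensure parameter, the Get-TargetResource placeholder could be filled in like this :

$GetParams = @{
    ZoneName = 'contoso.com'
    Ensure   = 'Present'
}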

$GetReturn = & "$($Global:DSCResourceName)\Get-TargetResource" @GetParams

 
Specifying the resource name along with the function allows us to unambiguously call the Get-TargetResource function from the DSC resource we are currently testing, and not the one from another resource.

It "Should return a hashtable" {
        $GetReturn | Should BeOfType System.Collections.Hashtable
    }

 
The first actual test ! This is validating that Get-TargetResource returns an object of the type [hashtable]. The “BeOfType” operator is designed specifically for verifying the type of an object, so it’s a great fit.

Foreach ($MandatoryParameter in $GetMandatoryParameter) {
        
        It "Should return a hashtable with key named $MandatoryParameter" {
            $GetReturn.ContainsKey($MandatoryParameter) | Should Be $True
        }
    }

 
An article from the PowerShell Team says this :

The Get-TargetResource returns the status of the modeled entities in a hash table format. This hash table must contain all properties, including the Read properties (along with their values) that are defined in the resource schema.

I’m not sure this is a hard requirement, because it is not enforced and Get-TargetResource is not automatically called by the DSC engine. So this may not be ideal, but we get the names of the mandatory parameters of Get-TargetResource and we check that the hashtable returned by Get-TargetResource has a key matching each of these parameters. Maybe we could check against all parameters, not just the mandatory ones ?
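If we wanted to go down that route, a possible tweak (a sketch, not part of the boilerplate above) would be to build the list of all parameters, excluding the common ones, and loop through that list instead :

# Getting all parameter names of Get-TargetResource, minus the common parameters
$CommonParameters = [System.Management.Automation.PSCmdlet]::CommonParameters
$AllGetParameters = (Get-Command "$($Global:DSCResourceName)\Get-TargetResource").Parameters.Keys |
Where-Object { $_ -notin $CommonParameters }

Foreach ($ParameterName in $AllGetParameters) {

    It "Should return a hashtable with key named $ParameterName" {
        $GetReturn.ContainsKey($ParameterName) | Should Be $True
    }
}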

Now, let’s turn our attention to Test-TargetResource :

    $TestReturn = & "$($Global:DSCResourceName)\Test-TargetResource" @TestParams

    It "Should have the same mandatory parameters as Get-TargetResource" {
        (Compare-Object $GetMandatoryParameter $TestMandatoryParameter).InputObject | Should Be $Null
    }

 
This test is validating that the mandatory parameters of Test-TargetResource are the same as for Get-TargetResource. There is a PSScriptAnalyzer rule for that, with an “Error” severity, so we can safely assume that this is a widely accepted and important best practice :

GetSetTest Parameters
 
Reading the name of this “It” block, we could assume that it is checking against $True or $False. But here, we use Compare-Object and validate that there is no difference between the 2 lists of mandatory parameters. This is to make the message we get in case the test fails more useful : it will tell us the offending parameter name(s).

    It "Should return a boolean" {
        $TestReturn | Should BeOfType System.Boolean
    }

 
The function Test-TargetResource should always return a boolean. This is a well known requirement and this is also explicitly specified in the templates generated by xDSCResourceDesigner, so there is no excuse for not knowing/following this rule.

Now, it is time to test Set-TargetResource :

    It "Should have the same mandatory parameters as Test-TargetResource" {
        (Compare-Object $TestMandatoryParameter $SetMandatoryParameter).InputObject | Should Be $Null
    }

 
The same as before, but this time we validate that the mandatory parameters of the currently tested function (Set-TargetResource) are the same as for Test-TargetResource.

    It "Should not return anything" {
        $SetReturn | Should Be $Null
    }

 
Set-TargetResource should not return anything. Again, you don’t have to take my word for it, PSScriptAnalyzer is our source of truth :

Set should not return anything
 
That’s it for the script. But then, a boilerplate is more useful when it is readily available as a snippet in your IDE of choice. So I also converted this boilerplate into a Visual Studio Code snippet ; it is the first snippet in the custom snippet file I made available here.

The path of the Visual Studio Code PowerShell snippet file is : %APPDATA%\Code\User\snippets\PowerShell.json.
Or, for those of us using the PowerShell extension, we can modify the following file : %USERPROFILE%\.vscode\extensions\ms-vscode.PowerShell-0.6.1\snippets\PowerShell.json.

Obviously, this set of tests is pretty basic and doesn’t cover the code written specifically for a given resource, but it’s a pretty good starting point. It allows us to write basic unit tests for our DSC resources in just a few minutes, so now, there’s no excuse for not doing it.
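Once the placeholders are filled in, running these tests is just a matter of pointing Invoke-Pester at the unit tests script, for example :

Invoke-Pester -Script '.\Tests\Unit\DnsRegistration.Tests.ps1'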

How to create a custom rule for PSScriptAnalyzer

As you probably already know, PSScriptAnalyzer is a static code analysis tool, which checks PowerShell code against rules representing best practices and style guidelines. This is a fantastic tool to set coding style, consistency and quality standards, and if we want to, we can easily enforce these standards within a build pipeline.

The PowerShell community was very much involved in the definition of PSScriptAnalyzer rules, so these rules really make a lot of sense as general guidelines and they are widely accepted by the PowerShell community. However, a given company or project might have specific coding standards which may contain different or more specific rules. Or maybe, you feel like Silicon Valley’s Richard regarding Tabs vs Spaces.

Fortunately, PSScriptAnalyzer allows us to create and use custom rules. In this article, we are going to learn how to do that with a simple example. Let’s say we have coding standards which specify that all variable names should follow a consistent capitalization style, in particular : PascalCasing. So we are going to write a PSScriptAnalyzer rule, in the form of a function, to check our code against that convention.

To write this function, our starting point should be this documentation page.
First, how are we going to name our function ? If we look at the CommunityAnalyzerRules module, we see that all the function names use the verb “Measure”. Why ? I don’t know, but it seems like a sensible convention to follow. That way, if we have multiple rules stored in a single module, we can export all of them by simply adding the following to the module :

Export-ModuleMember -Function Measure-*

 
So, given our rule is about PascalCasing, the function name “Measure-PascalCase” makes sense.

Next, we need a proper comment-based help for our function. This looks like this :

Function Measure-PascalCase {
<#
.SYNOPSIS
    The variables names should be in PascalCase.

.DESCRIPTION
    Variable names should use a consistent capitalization style, i.e. : PascalCase.
    In PascalCase, only the first letter is capitalized. Or, if the variable name is made of multiple concatenated words, only the first letter of each concatenated word is capitalized.
    To fix a violation of this rule, please consider using PascalCase for variable names.

.EXAMPLE
    Measure-PascalCase -ScriptBlockAst $ScriptBlockAst

.INPUTS
    [System.Management.Automation.Language.ScriptBlockAst]

.OUTPUTS
    [Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord[]]

.NOTES
    https://msdn.microsoft.com/en-us/library/dd878270(v=vs.85).aspx
    https://msdn.microsoft.com/en-us/library/ms229043(v=vs.110).aspx
#>

 
The DESCRIPTION part of the help is actually used by PSScriptAnalyzer so it is important. It should contain an explanation of the rule, as well as a brief explanation of how to remediate any violation of the rule. Here, we don’t want to assume that all users know what PascalCase means, so we give a succinct but (hopefully) clear definition of PascalCase.

In the INPUTS field, we tell the user that the only parameter for our function takes an object of the type : [System.Management.Automation.Language.ScriptBlockAst], but it could be other types of AST objects. But wait, what is AST ?

The short(ish) version is that PowerShell 3.0 introduced a new parser, and that parser relies on the AST (Abstract Syntax Tree) to expose various elements of the PowerShell language as objects. This facilitates parsing PowerShell code and extracting objects corresponding to language elements like : variables, function definitions, parameter blocks, parameters, arrays, hashtables, Foreach statements, If statements, the list goes on and on … And PSScriptAnalyzer relies heavily on this AST-based parser.
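To get a feel for this, here is a quick example (unrelated to our rule) of using the AST of a scriptblock to find its assignment statements :

C:\> $ScriptBlock = { $FirstVariable = 'One'; $secondvariable = 'Two' }
C:\> $ScriptBlock.Ast.FindAll({ $args[0] -is [System.Management.Automation.Language.AssignmentStatementAst] }, $True)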

In the OUTPUTS field, we explicitly tell the user that the function will return one or more objects of the type [Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord[]]. But the actual user will be PSScriptAnalyzer, so this is really a contract between our function and PSScriptAnalyzer. This is more formally declared with the following function attribute :

[OutputType([Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord[]])]

 
But even with this declaration, PowerShell doesn’t enforce that. So it’s our responsibility to ensure our code doesn’t return anything else, otherwise, PSScriptAnalyzer will not be happy.

Now it is time to tackle the code inside our function. Looking at the CommunityAnalyzerRules module, most functions have the same basic structure :

#region Define predicates to find ASTs.

[ScriptBlock]$Predicate = {
    Param ([System.Management.Automation.Language.Ast]$Ast)

    [bool]$ReturnValue = $False
    If ( ... ) {

        ...

    }
    return $ReturnValue
}
#endregion

#region Find ASTs that match the predicates.
[System.Management.Automation.Language.Ast[]]$Violations = $ScriptBlockAst.FindAll($Predicate, $True)

If ($Violations.Count -ne 0) {

    Foreach ($Violation in $Violations) {

        $Result = New-Object `
                -Typename "Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord" `
                -ArgumentList  ...
          
        $Results += $Result
    }
}
return $Results
#endregion

 
We don’t have to follow that structure, but it is a very helpful scaffolding. As we can see above, the function is divided into 2 logical parts : the first one is where we define one or more predicates corresponding to our rule, and the second one is where we actually use the predicate(s) against input (PowerShell code) to identify any violation(s) of our rule.

Defining predicates

First, what is a predicate ?
It is a scriptblock which returns $True or $False, and it is used to filter objects. We have a bunch of objects that we feed to our predicate ; then, we keep the objects for which the predicate returned $True and we filter out the objects for which the predicate returned $False. Sounds complicated ? It’s not, and you are using predicates. All. The. Time :

C:\> $ThisIsAPredicate = { $_.Name -like "*.ps*1" }
C:\> Get-ChildItem -Recurse | Where-Object $ThisIsAPredicate

 
In the context of our PSScriptAnalyzer rule function, the predicate is used to identify violations of our rule. Any piece of PowerShell code which returns $True when fed to our predicate has a violation of our rule. We can use multiple methods to detect violations, so we can define multiple predicates if we need/want to. Here, this is a simple example so we are going to define a single predicate.

Our predicate should take input (pieces of PowerShell code) via a parameter. Here, the parameter is named Ast and it takes objects of the type [System.Management.Automation.Language.Ast]. This is the generic AST class, which allows the predicate’s parameter to accept objects of child classes like [System.Management.Automation.Language.ScriptBlockAst], [System.Management.Automation.Language.StatementAst], etc…

            [ScriptBlock]$Predicate = {
                Param ([System.Management.Automation.Language.Ast]$Ast)

                ...

 
Our rule for PascalCasing relates only to variable names, so we first need to identify variables. What is most relevant for naming is when variables are defined or assigned a value, not really when they are referenced. So, arguably, the best way to identify variables for our particular purpose is to identify variable assignments, like so :

If ($Ast -is [System.Management.Automation.Language.AssignmentStatementAst]) {

    ...

 
Next, we need to identify any variable names which don’t follow PascalCasing. For that, we’ll use the comparison operator -cnotmatch and a regex. As you probably know, PowerShell is not case sensitive. But our rule is all about casing ; it is case hypersensitive. This makes the “c” in -cnotmatch crucial for our predicate to work :

[System.Management.Automation.Language.AssignmentStatementAst]$VariableAst = $Ast
    If ($VariableAst.Left.VariablePath.UserPath -cnotmatch '^([A-Z][a-z]+)+$') {
        $ReturnValue = $True
    }

 
To extract only the variable names from our variable assignment objects, we take their “Left” property (what’s on the left side of the assignment operator), then the “VariablePath” property and then the “UserPath” nested property. This gives us only the variable name as a [string]. If that string doesn’t match our regular expression, the predicate returns $True, which means there is a violation.

A brief explanation of the regex used above ([A-Z][a-z]+) :
this means one upper case letter followed by one or more lower case letter(s). This particular pattern can be repeated, so we put it between parentheses and append a “+”. And all this should be strictly between the beginning of the string “^” and the end of the string “$”.

Of course, this detection method is limited because there is no intelligence to detect words of the English language (or any language) which may be concatenated to form the variable name :

PS C:\> "FirstwordSecondword" -cmatch '^([A-Z][a-z]+)+$'
True

PS C:\> "FirstwoRdsecoNdword" -cmatch '^([A-Z][a-z]+)+$'
True

 
Also, I’m not a big fan of using digits in variable names but if you want the rule to allow that, you can use the following regex :

PS C:\> "Word1Word2" -cmatch '^([A-Z]\w+)+$'
True

 

Using the predicate to detect violations

Now, we can use our predicate against whatever PowerShell code is fed to our Measure-PascalCase function via its $ScriptBlockAst parameter. The input PowerShell code is an object of the type [System.Management.Automation.Language.ScriptBlockAst], so like most AST objects, it has a FindAll() method which we can use to find all the elements within that object which match a predicate.

[System.Management.Automation.Language.Ast[]]$Violations = $ScriptBlockAst.FindAll($Predicate, $True)

 
The second parameter of the FindAll() method ($True) tells it to search recursively in nested elements.

Now, for any violation of our rule, we need to create an object of the type [Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord], because PSScriptAnalyzer expects our function to return an array of object(s) of that specific type :

Foreach ($Violation in $Violations) {

    $Result = New-Object `
            -Typename "Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord" `
            -ArgumentList "$((Get-Help $MyInvocation.MyCommand.Name).Description.Text)",$Violation.Extent,$PSCmdlet.MyInvocation.InvocationName,Information,$Null
          
    $Results += $Result
}

 
Pay particular attention to the 5 values passed to the -ArgumentList parameter of the cmdlet New-Object. To see what each of these values corresponds to, we can have a look at the constructor(s) for this class :

C:\> [Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord]::new

OverloadDefinitions
-------------------
Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord new()
Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord new(string message,
System.Management.Automation.Language.IScriptExtent extent, string ruleName,
Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticSeverity severity, string scriptName, string ruleId)

 
For the “Message” property of our [DiagnosticRecord] objects, hard-coding a relatively long message would not look nice, so here, we are reusing our carefully crafted description from the comment-based help. We don’t have to do this, but that way, we don’t reinvent the wheel.

Then, each resulting object is added to an array : $Results.
Finally, when we are done processing violations, we return that array for PSScriptAnalyzer‘s consumption :

            }
            return $Results
            #endregion
        }

 
That’s it. The module containing the full function is on this GitHub page.

Now, let’s use our custom rule with PSScriptAnalyzer against an example script :

C:\> Invoke-ScriptAnalyzer -Path .\ExampleScript.ps1 -CustomRulePath .\MBAnalyzerRules.psm1 |
 Select-Object RuleName, Line, Message | Format-Table -AutoSize -Wrap

RuleName                           Line Message
--------                           ---- -------
MBAnalyzerRules\Measure-PascalCase   15 Variable names should use a consistent capitalization style, i.e. : PascalCase.
                                        In PascalCase, only the first letter is capitalized. Or, if the variable name
                                        is made of multiple concatenated words, only the first letter of each
                                        concatenated word is capitalized.
                                        To fix a violation of this rule, please consider using PascalCase for variable
                                        names.
MBAnalyzerRules\Measure-PascalCase   28 Variable names should use a consistent capitalization style, i.e. : PascalCase.
                                        In PascalCase, only the first letter is capitalized. Or, if the variable name
                                        is made of multiple concatenated words, only the first letter of each
                                        concatenated word is capitalized.
                                        To fix a violation of this rule, please consider using PascalCase for variable
                                        names.
MBAnalyzerRules\Measure-PascalCase   86 Variable names should use a consistent capitalization style, i.e. : PascalCase.
                                        In PascalCase, only the first letter is capitalized. Or, if the variable name
                                        is made of multiple concatenated words, only the first letter of each
                                        concatenated word is capitalized.
                                        To fix a violation of this rule, please consider using PascalCase for variable
                                        names.
MBAnalyzerRules\Measure-PascalCase   88 Variable names should use a consistent capitalization style, i.e. : PascalCase.
                                        In PascalCase, only the first letter is capitalized. Or, if the variable name
                                        is made of multiple concatenated words, only the first letter of each
                                        concatenated word is capitalized.
                                        To fix a violation of this rule, please consider using PascalCase for variable
                                        names.

 
That’s cool, but we probably want to see the actual variable names which are not following our desired capitalization style. We can obtain this information like so :

VariableNames
 
We can see that in the case of this script (pun intended), the case of variable names is all over the place, and we can easily go and fix it.
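If you prefer to stay in the console, a query along these lines should produce a similar view (assuming, as for the built-in rules, that the offending piece of code is exposed via the Extent property of the DiagnosticRecord objects) :

C:\> Invoke-ScriptAnalyzer -Path .\ExampleScript.ps1 -CustomRulePath .\MBAnalyzerRules.psm1 |
 Select-Object -Property Line, @{ Label = 'OffendingCode'; Expression = { $_.Extent.Text } } |
 Format-Table -AutoSize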

Integrating PSScriptAnalyzer in an Appveyor Continuous Integration pipeline

Many of us who are writing PowerShell code are using the free (and awesome) Appveyor service for Continuous Integration (especially for personal projects). And most of us use this to run Pester tests. Automated testing is great : it allows us to set a certain standard of code quality without slowing down code delivery. But this only checks that the code behaves as we intended.

What about code consistency, style, readability and following best practices ?

This is where a static code analysis tool like PSScriptAnalyzer comes in. Even though PSScriptAnalyzer is a perfect fit in a PowerShell “build” process, searching the web for “integrating PSScriptAnalyzer and Appveyor” doesn’t yield very helpful results. So here is the solution I came up with :

version: 1.0.{build}

os: WMF 5

# Skip on updates to the readme
skip_commits:
  message: /readme*/
  
install:
  - ps: Install-PackageProvider -Name NuGet -Force
  - ps: Install-Module PsScriptAnalyzer -Force
  
build: false

test_script:
  - ps: |
      Add-AppveyorTest -Name "PsScriptAnalyzer" -Outcome Running
      $Results = Invoke-ScriptAnalyzer -Path $pwd -Recurse -Severity Error -ErrorAction SilentlyContinue
      If ($Results) {
        $ResultString = $Results | Out-String
        Write-Warning $ResultString
        Add-AppveyorMessage -Message "PSScriptAnalyzer output contained one or more result(s) with 'Error' severity.`
        Check the 'Tests' tab of this build for more details." -Category Error
        Update-AppveyorTest -Name "PsScriptAnalyzer" -Outcome Failed -ErrorMessage $ResultString
        
        # Failing the build
        Throw "Build failed"
      }
      Else {
        Update-AppveyorTest -Name "PsScriptAnalyzer" -Outcome Passed
      }

This is the content of my appveyor.yml file, which is the file from which Appveyor gets the build configuration.

Line 3 : This indicates from which VM template the build agent will be deployed. As its name indicates, this allows us to have a build agent running in a VM with PowerShell version 5. If you believe only what you can see, add $PSVersionTable in the appveyor.yml and check the result in the build console. PowerShell 5 means we can easily add PowerShell scripts, modules and DSC resources to our build agent from the PowerShell Gallery using PackageManagement.

Line 10-11 : This is exactly what we do here. But first, because the PowerShell Gallery relies on NuGet, we need to install the NuGet provider. Then, we can install any PowerShell module we want from the PowerShell Gallery, PsScriptAnalyzer in this case. We didn’t specify the repository because the PowerShell Gallery is the default one.

Line 13 : This refers specifically to MSBuild and we don’t need or want MSBuild for a PowerShell project.

Line 15-End : This is where all the PSScriptAnalyzer stuff goes. So from an Appveyor point of view, this will be a test. Even though static code analysis is not testing, it kinda makes sense : we are assessing the code against a set of rules which represent a certain standard and we want a “Pass” or a “Fail” depending on whether the code meets the standard or not.

Line 16 : In YAML, the pipe character “|” allows values to span multiple lines. This is very convenient for code blocks, like here. That way, we don’t need to add “- ps:” at the beginning of each line.

Line 17 : Appveyor doesn’t have a direct integration with PSScriptAnalyzer like it has for some testing frameworks (NUnit, MSTest, etc…) but it’s OK. The Appveyor worker (the actual build agent) provides a REST API and even a few PowerShell cmdlets leveraging this API. One of these cmdlets is Add-AppveyorTest. Using this cmdlet, we are adding a new test, giving it a name and telling the build agent that the test is in the “Running” state.

Line 18 : We run PSScriptAnalyzer against all the files in the current directory, recursively. We specify the “Error” severity to output only the violations of level “Error“, because we don’t want a violation of severity “Information” or even “Warning” to make the test fail. We store the result in a variable for later use.

Line 20 : If there are any “errors” from PSScriptAnalyzer’s perspective, we want to display them as a message in the build console and in the error message of the “test”. That’s why we need to convert the output object(s) from PSScriptAnalyzer to a string.

Line 21 : Writing the violation(s) to the build console. We could use Write-Host or Write-Output as well but as we’ll see in a later screenshot, the warning stream makes it stand out more visibly.

Line 22 : This Appveyor-specific cmdlet adds a message to the build’s “Messages” tab. Specifying “Error” for the category just displays the message with a touch of red on its left :

Appveyor Message fail
 
Line 24 : Update-AppveyorTest is another cmdlet leveraging the Appveyor build worker API. Here, we are using it to update the status of our existing test and add an error message to it. This message is PSScriptAnalyzer output converted to a string, so we can check the test message to see exactly what the problem is :

Appveyor Test fail
 

Line 27 : We need to use “Throw” to explicitly fail the build. Otherwise, the build is considered successful, even if the “test” fails.

Line 30 : If PSScriptAnalyzer didn’t output anything, meaning there were no violations of the “Error” severity in any file scanned by PSScriptAnalyzer, we consider that our project passes the test. Again, we use Update-AppveyorTest but this time, we tell it that the outcome of the “test” is a pass.

Now, let’s see what this looks like when we run a build :

Appveyor Build success
 
Not much output, because all is well. Also, the test is green :

Appveyor Test Success
 
Do you like watching “Fail” videos on Youtube ? If yes, you are probably dying to see my build fail, right ? So, here we go :

Appveyor Build fail
 
Wow, the yellow background of the warning stream is not elegant but it sure stands out !

Also, if you want to see the “Passing” Appveyor badge on the GitHub repository, head over THERE.

This is it.
PSScriptAnalyzer is an important tool that any PowerShell scripter should use. Appveyor is awesome, so combining both of these tools is pretty powerful.

Documentation as Code : Exporting the contents of DSC MOF files to Excel

One of the greatest benefits of PowerShell DSC (and other Configuration Management tools/platforms) is the declarative syntax (as opposed to imperative scripting). Sure, a DSC configuration can contain some logic, using loops and conditional statements, but we don’t need to care about handling errors or checking if something is already present. All this (and the large majority of the logic) is handled within the resource, so we just need to describe the end result, the “Desired State”.

So all the settings and information that a configuration is made of are stored in a very simple (and pretty much human-readable) syntax, like :

Node $AllNodes.NodeName
    {
        cWindowsErrorReporting Disabled
        {
            State = "Disabled"
        }
    }

 
This allows us to use this “code” (for lack of a better word) as documentation in a way that wouldn’t be possible or practical with imperative code. For this purpose, we could use DSC configurations, or DSC configuration data files if all the configuration data is stored separately. But the best files for that would probably be the MOF files for 2 reasons :

  • Even if some settings are in different files, we can be sure that all the settings for a given node are in a single MOF file (the exception being partial configurations)
  • Even if the DSC configuration contains complex logic, there is no need to understand or parse this logic to get the end result. All this was already done for us when the MOF file was generated

Now, imagine you have all your MOF files stored in a directory structure like this :

PS C:\> tree C:\DSCConfigs /F
Folder PATH listing for volume OS
C:\DSCCONFIGS
├───Customer A
│   ├───Dev
│   │       Server1.mof
│   │       Server2.mof
│   │
│   ├───Prod
│   │       Server1.mof
│   │       Server2.mof
│   │
│   └───QA
│           Server1.mof
│           Server2.mof
│
├───Customer B
│   ├───Dev
│   │       Server1.mof
│   │       Server2.mof
│   │
│   ├───Prod
│   │       Server1.mof
│   │       Server2.mof
│   │
│   └───QA
│           Server1.mof
│           Server2.mof
│
└───Customer C
    ├───Dev
    │       Server1.mof
    │       Server2.mof
    │
    ├───Prod
    │       Server1.mof
    │       Server2.mof
    │
    └───QA
            Server1.mof
            Server2.mof

You most likely have many more than 2 servers per environment, so there can easily be a large number of MOF files.
Then, imagine your boss tells you : “I need all the configuration settings, for all customers, all environments and all servers in an Excel spreadsheet to sort and group the data easily and to find out the differences between Dev and QA, and between QA and Prod”.

If you are like me, you may not quite understand bosses’ uncanny obsession with Excel but this definitely sounds like something useful and an interesting challenge. So, let’s do it.

We’ll divide this in 3 broad steps :

  • Converting the contents of MOF files to PowerShell objects
  • Exporting the resulting PowerShell objects to a CSV file
  • Processing the data using PowerShell and/or Excel

Converting the contents of MOF files to PowerShell objects

This is by far the most tricky part.
Fortunately, I wrote a function, called ConvertFrom-DscMof, which does exactly that. We won’t go into much detail about how it works, but you can have a look at the code here.

Basically, it parses one or more MOF files and it outputs an object for each resource instance contained in the MOF file(s). All the properties of a given resource instance become properties of the corresponding object, plus a few properties related to the MOF file.

Here is an example with a very simple MOF file :

ConvertFrom-DscMofExample
 
And here is an example with all the properties of a single resource instance :

ConvertFrom-DscMofSingle
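A command along these lines can produce that kind of view for a single MOF file (relying on the fact that ConvertFrom-DscMof accepts file objects from the pipeline, as in the later example) :

C:\> Get-Item -Path '.\Server1.mof' | ConvertFrom-DscMof |
 Select-Object -First 1 | Format-List -Property *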
 

Exporting the resulting PowerShell objects to a CSV file

As we have the ability to get DSC configuration information in the form of PowerShell objects, it is now very easy to export all this information as CSV. But there’s a catch : different resources have different parameters, for example the Registry resource has the ValueName and ValueData parameters and the xTimeZone resource has a TimeZone parameter.

So the resulting resource instance objects will have ValueName and ValueData properties if they are instances of the Registry resource, and a TimeZone property if they are instances of the xTimeZone resource. Even for a given resource, some parameters are optional and they will end up in the properties of the resulting PowerShell object only if they were explicitly specified in the configuration.

The problem is that Export-Csv doesn’t intelligently handle objects with different properties : it will just create the columns from the properties of the first object in the collection and apply them to all objects, even objects which have different properties.

But, we can rely on the “ResourceID” property of each resource instance because it uniquely identifies the resource instance. Also, it contains the name we gave to the resource block in the DSC configuration, which should be a nice meaningful name, right ?
Yeah, this is where “Documentation as code” meets “self-documenting code” : they are both important and very much complementary. To get an idea of what the values of ResourceID look like, refer back to the first screenshot.

Below, we can see how to export only the properties we need, and only the properties that we know will be present for all resource instances :


Get-ChildItem C:\MOFs\ -File -Filter "*.mof" -Recurse |
ConvertFrom-DscMof |
Select-Object -Property "MOF file Path","MOF Generation Date","Target Node","Resource ID","DSC Configuration Info","DSC Resource Module" |
Export-Csv -Path 'C:\DSCConfig Data\AllDSCConfigs.csv' -NoTypeInformation

 

Processing the data using PowerShell and/or Excel

The resulting CSV file can be readily opened and processed by Excel (or equivalent applications) :

CSVFileInExcel
 
Now, we have all the power of Excel at our fingertips : we can sort, filter and group all this data however we want.

Now, here is a very typical scenario : the Dev guys have tested their new build and it worked smoothly in their environment. However, the QA guys say that the same build is failing miserably in their environment. The first question which should come to mind is : “What is the difference between the Dev and QA environments ?”

If all the configuration of these environments is done with PowerShell DSC, the ConvertFrom-DscMof function can be a great help to answer that very question :

C:\> $CustomerCDev = Get-ChildItem -File -Filter '*.mof' -Recurse 'C:\MOFs\Customer C\Dev\' |
ConvertFrom-DscMof
C:\> $CustomerCQA = Get-ChildItem -File -Filter '*.mof' -Recurse 'C:\MOFs\Customer C\QA\' |
ConvertFrom-DscMof
C:\> Compare-Object -ReferenceObject $CustomerCDev -DifferenceObject $CustomerCQA -Property 'Target Node','Resource ID'

Target Node Resource ID                    SideIndicator
----------- -----------                    -------------
Server1     [xRemoteFile]RabbitMQInstaller <=
Server1     [Package]RabbitMQ              <=

 
Oops, we forgot to install RabbitMQ on Server1 ! No wonder it’s not working in QA.
But now, there is hope. We, forgetful and naturally flawed human beings, can rely on this documentation automation to tell us how things really are.

So, as we have seen, Infrastructure-as-code (PowerShell DSC in this case) can be a nice stepping-stone towards infrastructure documentation.
What is the number 1 problem for any infrastructure/configuration documentation ?
Keeping it up-to-date. This approach can help generate the documentation dynamically, which means it can be kept up-to-date pretty easily without any human intervention.