Debugging SharePoint Add-ins Stuck On Installing/Removing Or Failing To Upgrade

There can be times when a SharePoint Add-in instance just won’t install (or uninstall). You get the familiar “We’re adding your app” message on the new tile that has been added to the Site Contents page but nothing changes – it just stays like that.

There can be many reasons for these kinds of issues, but if it is happening on all Add-in instances you are trying to deploy or remove, then it could be because the necessary services aren’t running.

The first thing to check is that the ‘App Installation Service’ timer job is running (or has run recently). You’ll need to access SharePoint’s Central Administration site to do this, and then go to Monitoring, and then choose the ‘Review Job Definitions’ link under the Timer Jobs section.

The ‘App Installation Service’ timer job should be one of the first ones in the list and should run fairly regularly. Click on the timer job name to view details about it. Look at the last run time under Job Properties; if it hasn’t run recently (within the last few minutes), try getting it to run immediately by clicking the ‘Run Now’ button. Wait a few seconds and check its last run time again. If it has just run, go back to the site where your Add-in instance was being installed and the app should now install.
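The same check can also be made from a SharePoint Management Shell session instead of Central Administration. The following is only a minimal sketch: it assumes the job's display name starts with "App Installation", as it appears in the timer job list, so adjust the filter if your farm labels it differently.

```powershell
# Assumes the SharePoint snap-in is loaded (as it is in a SharePoint Management Shell).
# The display-name filter is an assumption based on the job name shown in Central Administration.
$jobs = Get-SPTimerJob | Where-Object { $_.DisplayName -like "App Installation*" }

# Show when each matching job last ran.
$jobs | Select-Object DisplayName, Id, LastRunTime | Format-Table -AutoSize

# Uncomment to trigger the job(s) immediately, much like clicking 'Run Now'.
# $jobs | ForEach-Object { $_.RunNow() }
```

If nothing is listed, the filter probably doesn't match the job name on your farm, which is not the same thing as the job being missing.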

If the Add-in still doesn’t install, check that the farm’s timer service instances are online by running the following PowerShell commands in a SharePoint Management Shell (SPMS) session:

$farm = Get-SPFarm
$farm.TimerService.Instances

This should list the timer service instances configured on the farm and their status, and you may have one or more of them depending on your farm configuration. If any of them aren’t showing a status of Online, this could be your problem.

Here’s a handy script that can be used to check (or check and auto-fix) troublesome timer service instances:

# Set the 'AutoFix' variable to false to check status, or to true to check status 
# and attempt to bring any disabled services online.
$AutoFix = $false 

$farm = Get-SPFarm
$allInstances = $farm.TimerService.Instances
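# Wrap in @() so that .Count behaves the same for zero, one or many results.
$disabledInstances = @($allInstances | Where-Object { $_.Status -ne "Online" })
if ($disabledInstances.Count -eq 0)
{
    Write-Host "All ($($allInstances.Count)) timer instances are online!" -ForegroundColor Green
    # 'return' rather than 'exit', so an interactive shell is not closed when the script is pasted in.
    return
}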

foreach ($instance in $disabledInstances)
{
    Write-Host "Timer service instance on $($instance.Server.Name) is NOT online (status: $($instance.Status))." -ForegroundColor Magenta
    if ($AutoFix -eq $true)
    {
        Write-Host "Attempting to start the service instance... " -NoNewline
        # Mark this instance as online and persist the change to the configuration database.
        $instance.Status = [Microsoft.SharePoint.Administration.SPObjectStatus]::Online
        $instance.Update()

        if ($instance.Status -eq "Online")
        {
            Write-Host "Success!" -ForegroundColor Green
        }
        else 
        {
            Write-Host "FAILED!" -ForegroundColor Magenta
        }
    }
}

If any timer service instances were not online and needed to be started, restart the SharePoint Administration service and the SharePoint Timer service on each server (via Services.msc in the operating system’s administrative tools), and then perform an IIS reset on each server.
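If you would rather script the restarts than click through Services.msc, something along these lines may work. It is a sketch, not a tested procedure: SPAdminV4 and SPTimerV4 are the Windows service names SharePoint 2013 normally uses for these two services, but confirm them with Get-Service on your own servers before restarting anything.

```powershell
# Run in an elevated PowerShell session on each SharePoint server in the farm.
# Confirm the service names and current state first.
Get-Service -Name SPAdminV4, SPTimerV4 | Format-Table Name, DisplayName, Status -AutoSize

# Restart the SharePoint Administration service, then the SharePoint Timer service.
Restart-Service -Name SPAdminV4
Restart-Service -Name SPTimerV4

# Restart IIS once the services are back up.
iisreset
```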

Check to see if your Add-in instance is installing now. It may take a few minutes as there may be a queue of timer jobs being processed, not just your add-in installation.

If your Add-in still isn’t installing, it could be that another timer job is not running and is preventing the queue from clearing down. You can check this with some more PowerShell run in an SPMS session:

Get-SPTimerJob | Where-Object { $_.Schedule.Description -eq "One-time" } | Select-Object Id,Name,DisplayName,{$_.Schedule.Description},{$_.Schedule.Time},LastRunTime | Format-Table -AutoSize

This will display the one-off timer jobs that are currently queued up to be processed. In my experience, the ‘One-time’ jobs are those most likely to be causing issues and it means a much reduced list of jobs to sift through compared to the full list of jobs queued up/running on the farm.
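To narrow the list further, one option is to keep only the one-time jobs whose scheduled time is already well in the past. The helper below is hypothetical (Select-OverdueOneTimeJob is not a SharePoint cmdlet) and the 15 minute default is an arbitrary choice. It also assumes the schedule time and the local clock are in the same time zone.

```powershell
# Hypothetical helper, not part of SharePoint: returns the one-time jobs whose
# scheduled time is more than $OlderThanMinutes before $Now.
function Select-OverdueOneTimeJob {
    param(
        [object[]] $Jobs = @(),
        [int] $OlderThanMinutes = 15,
        [datetime] $Now = (Get-Date)
    )

    $cutoff = $Now.AddMinutes(-$OlderThanMinutes)
    $Jobs | Where-Object {
        $_.Schedule.Description -eq "One-time" -and $_.Schedule.Time -lt $cutoff
    }
}

# Usage in a SharePoint Management Shell session:
# Select-OverdueOneTimeJob -Jobs (Get-SPTimerJob) | Select-Object Id, Name, LastRunTime
```

Taking the jobs as a parameter, rather than calling Get-SPTimerJob inside the function, means the filter can be tried against sample objects before it is pointed at a real farm.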

Review any jobs, looking for ones that could be stuck and not executing by reviewing their schedule date/time and last run date/time column values. Before taking more drastic measures (like deleting jobs), try simply running them again, with the following PowerShell commands (in a SPMS session):

$job = Get-SPTimerJob -Identity "timer-job-guid-here"
$job.RunNow()

The ‘timer-job-guid-here’ is the value in the Id column from the earlier query for one-time timer jobs.

In most cases these jobs will now run through and clear down the queue. You can keep checking the one-time timer jobs queue using the earlier query. If, however, the queue is not emptying then those timer jobs may need to be deleted. Many timer jobs can be deleted without adversely affecting the farm, but take some time to work out what they are for and make sure you aren’t deleting any required for key maintenance activities, etc.

Once you are satisfied that you can delete any troublesome jobs without undue effects on the environment, you can do so using the following PowerShell commands (in a SPMS session), where necessary:

$job = Get-SPTimerJob -Identity "timer-job-guid-here"
$job.Delete()

Hopefully once you’ve cleared down the queue your Add-in instance will finally install successfully.

If it still won’t install there might be a problem with your Add-in package or wider problems with the app catalog or the Add-in hosting environment that has been set up, but these issues and investigations are outside the scope of this article.

UPDATE: I recently came across a ‘gotcha’ where an Add-in wouldn’t upgrade to a newer version and thought it warranted documenting. The error message logged in the ‘DETAILS’ section was ‘User or group 123 cannot be resolved’. To check for install/upgrade errors, you simply click the ellipsis (…) next to the Add-in icon on the Site Contents page and then choose the DETAILS option.

On the DETAILS page, look for the Errors section; Upgrade Errors should have at least one error logged. Click on the number to view the error details.

The error details should include a message along the lines of “There was a problem accessing the file system on the server. Details: User or group 123 cannot be resolved”. This is usually caused by SharePoint groups being re-added on the new app web.

Add-ins are upgraded by making the current instance read-only, making a new site object, copying everything across to the new app web site, and finally re-pointing the URLs.

Users and groups have to be added into the new site and if one or more SharePoint groups have an owner that is another one of those groups, this is where the issue can occur. It happens because the groups may not be added in the correct order.

The best way to fix it is to temporarily change the owner of each group the site uses to a specific user, and then change everything back after the upgrade has completed.
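The owner swap can be scripted if there are many groups to change. What follows is a rough sketch rather than a tested procedure: the site URL and account name are placeholders, and it looks at every group in the site collection ($web.SiteGroups), which may be broader than you need. Try it on a non-production site first.

```powershell
# Placeholders: replace the URL and the account with values from your own environment.
$web = Get-SPWeb "https://sharepoint.example.com/sites/mysite"
$tempOwner = $web.EnsureUser("DOMAIN\some.admin")

# Remember the original owner of every group that is owned by a SharePoint group,
# then temporarily hand ownership to the specific user.
$originalOwners = @{}
foreach ($group in @($web.SiteGroups)) {
    if ($group.Owner -is [Microsoft.SharePoint.SPGroup]) {
        $originalOwners[$group.Name] = $group.Owner.Name
        $group.Owner = $tempOwner
        $group.Update()
    }
}

# ... upgrade the Add-in here, then restore the original owners:
foreach ($name in $originalOwners.Keys) {
    $group = $web.SiteGroups[$name]
    $group.Owner = $web.SiteGroups[$originalOwners[$name]]
    $group.Update()
}

$web.Dispose()
```

The $originalOwners table only lives in the current session, so keep the shell open (or export the table to a file) until the upgrade has finished and the owners have been restored.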

Another, less likely, cause is a user field (Person or Group field) in a list item that has somehow become corrupted, or a user record that has been deleted from the content database. (The second scenario is very unlikely in production environments because, even when a user is deleted via the UI or PowerShell, the record is just marked as deleted; no actual record normally gets removed.)

If you suspect a user field in a list to be causing the problem, then you’ll need to use the Correlation ID quoted in the error details, and filter SharePoint’s ULS logs on that value. It should return a set of entries relating to the Add-in upgrade process and you’ll find the extended error information you require in most cases.
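One way to do that filtering from an SPMS session is sketched below. The correlation ID and the output path are placeholders, and the one hour window is an assumption, so widen it to cover the time of the failed upgrade.

```powershell
# Placeholder: replace with the Correlation ID quoted in the Add-in's error details.
$correlationId = "00000000-0000-0000-0000-000000000000"

# Search this server's ULS logs over the last hour for entries with that correlation ID.
Get-SPLogEvent -StartTime (Get-Date).AddHours(-1) |
    Where-Object { $_.Correlation -eq $correlationId } |
    Select-Object Timestamp, Area, Category, Level, Message |
    Format-List

# On a multi-server farm, Merge-SPLogFile can gather matching entries from every server.
# The output path is a placeholder.
# Merge-SPLogFile -Path "C:\Temp\AddinUpgrade.log" -Correlation $correlationId
```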

The one case I have observed where this was the issue was on an unsupported development environment where the database had been manually hacked by someone, and it led to an error in the log like this: “[ListItem] [543_.000] User or group 123 cannot be resolved.”

By going to the list and checking the item (ID=543), I discovered the ‘Person or Group’ field was indeed not displaying information as expected. I simply edited and re-saved the item, which cleared the fault.