Difference between revisions of "Performance issues"

Revision as of 15:41, 23 February 2017

The following page can be used to diagnose performance issues in PARS. Some potential solutions for common issues are listed on this page, but the information shown below is not a definitive list of the causes and solutions for performance issues.

At TASC we will try to help our customers wherever possible but it is important to note we are software specialists, not network managers or server engineers.

We have neither the remit nor the expertise to log on to your server and resolve performance issues for you.

In a well-configured environment (comprising the IIS server, the SQL server and the network) PARS will run quickly. Any performance issues are the result of issues in the environment, not the result of the PARS programming.

Determining if PARS is Running Slowly

PARS running on an Intel i5 SQL server with mechanical drives

Hold down CTRL and SHIFT when opening a register from the main PARS Connect diary page. Any register will do.

This will open a Trace Window when loading the register. The Trace Window records the exact times taken by various functions required to open a register. On a decent system a register should take less than 5 seconds to open. On a well-configured system a register will take less than 2 seconds to open.

If the register takes longer than 5 seconds, examine the times in the Trace Window to see what is taking a long time. Repeat this a few times to get consistent, average timings.

An example Trace Window for a real school during peak time

"Exper Method 2" should always take the longest amount of time. This is the process of retrieving attendance marks from SIMS. Nothing in either PARS or SIMS can be changed to improve this time; if Exper Method 2 is taking too long then reconfiguring the system is the only way to improve performance.

Exper Method 2 should take less than 3 seconds; if it does not, follow the instructions below. If Exper Method 2 takes less than 3 seconds then check the other times in the Trace Window. Several times highlighted in orange or red indicate an issue with the SQL server.

If the register is taking more than 5 seconds to load, Exper Method 2 is taking less than 3 seconds and one or two times in the Trace Window are highlighted in orange or red, post a ticket on the helpdesk with a screenshot of the Trace Window.
In any other case, the root cause of performance issues in PARS lies with either your IIS server or your SQL server (or both).
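
The rules in this section can be summarised as a short sketch. The function name, argument names and return values are hypothetical and are not part of PARS; the thresholds are the ones quoted above. Two points are assumptions: a register of exactly 5 seconds is treated as acceptable, and "several" highlighted times is read as three or more.

```python
def diagnose_register_trace(register_seconds, exper_method_2_seconds, highlighted_count):
    """Classify one Trace Window reading using the thresholds on this page.

    highlighted_count is the number of times shown in orange or red.
    Returns one of: "ok", "helpdesk", "sql", "environment".
    """
    # A register that opens within 5 seconds needs no further action
    # (assumption: exactly 5 seconds counts as acceptable).
    if register_seconds <= 5:
        return "ok"

    if exper_method_2_seconds < 3:
        # One or two highlighted times: post a helpdesk ticket with a screenshot.
        if highlighted_count in (1, 2):
            return "helpdesk"
        # Several highlighted times (assumed to mean three or more): SQL server.
        if highlighted_count >= 3:
            return "sql"

    # Any other case: the cause lies with the IIS server, the SQL server or both.
    return "environment"
```

For example, a register that takes 7 seconds with Exper Method 2 at 2 seconds and one highlighted time would return "helpdesk", whereas the same register with Exper Method 2 at 4 seconds would return "environment".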

Identifying the Issue

In general terms, any performance issues will lie with either the IIS server (due to overuse) or the SQL server. Use this page to decide whether it is the IIS server or the SQL server causing the issue:
PARS main menu > System management > System > Performance monitor

IIS

The easiest to rule out (and to fix) is the IIS server.

The main indicator of IIS throughput is the Processor Queue Length. This should read 0 nearly all of the time, occasionally flicking up to 1 or 2. A Processor Queue Length consistently higher than 0 is a problem.
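
If you note down the Processor Queue Length at intervals, the rule above could be checked with a sketch like the following. The function name and both default parameters are assumptions made for illustration: "nearly all of the time" is read here as at least 90% of readings, and the occasional spike is capped at 2 as described above.

```python
def queue_length_is_healthy(samples, min_zero_fraction=0.9, max_spike=2):
    """Return True if a list of Processor Queue Length readings looks healthy.

    min_zero_fraction and max_spike are illustrative assumptions, not PARS settings.
    """
    if not samples:
        raise ValueError("at least one sample is required")

    # Fraction of readings where nothing was waiting for the processor.
    zero_fraction = sum(1 for s in samples if s == 0) / len(samples)

    # Healthy: almost always 0, and never spiking above 1 or 2.
    return zero_fraction >= min_zero_fraction and max(samples) <= max_spike
```

Twenty readings containing a single 1 would pass this check; a run of readings that sits at 1 or 2 throughout would not.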

All four LEDs should be green. If any are yellow or red, there is a problem on the IIS server which must be resolved. You may not have enough memory or enough sufficiently fast cores, or you may have a failing or slow disk drive that cannot keep up, or a poorly performing network card.

Running the IIS server as a Virtual Machine can cause additional problems; Power Saving features can be left on and there may be settings on the Host to limit CPU performance to a percentage of the host's capability (nominally to maintain CPU cycles for other VM guests) that will have the effect of crippling the IIS server's performance.

SQL

If the IIS server is running well, turn your attention to the SQL server. This will have the same potential pitfalls, and more, when running as a virtualised machine and sharing the host with the IIS server.

SQL bottlenecks will be either CPU, memory or disk.

If your CPU graph on the SQL tab is maxed out nearly all the time, CPU is probably the limiter. Increasing the number of cores or upgrading the existing cores will solve this issue.

If there is high drive-level latency, disk access is probably the limiter. Check that:

  • Your SIMS database is running at the right compatibility level
  • There is enough free space and your log files are not being overused (on-screen warnings about long wait times are a very bad sign)
  • Close and Shrink are not enabled

If the LED for memory is green then available memory is unlikely to be an issue. If the LED is not green, increase the memory available to the SQL server.

A very realistic test of the SQL server can be performed using the following:
PARS main menu > System management > Management reports > Table speed test

This is a stress test that attempts to write a number of records to the SIMS and PARS databases, update them and then delete them. This will give a baseline figure as to the performance of the SQL server. You should expect to get around 2 minutes each for the longest PARS tests and 3 minutes each for the longest SIMS tests. If they are much longer than expected, your bottleneck is with the SQL server.
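
As a rough illustration, the comparison against the expected times could be written as below. The names are hypothetical, and the tolerance of 1.5 is only an assumed reading of "much longer than expected"; the expected figures of 2 and 3 minutes are the ones given above.

```python
# Expected duration in minutes of the longest tests, as quoted on this page.
EXPECTED_MINUTES = {"PARS": 2.0, "SIMS": 3.0}


def sql_server_is_bottleneck(longest_minutes, tolerance=1.5):
    """Return True if any longest test ran much longer than expected.

    longest_minutes maps "PARS" and "SIMS" to the longest time (in minutes)
    shown in the Table speed test report. tolerance is an assumed multiplier.
    """
    return any(
        longest_minutes[database] > expected * tolerance
        for database, expected in EXPECTED_MINUTES.items()
    )
```

With these assumptions, results of 2.1 minutes for PARS and 3.2 minutes for SIMS would not be flagged, while 6 minutes for PARS would be.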

Potential Resolutions

Automation

Automated tasks or reports may increase the load on the SQL server. You can find a list of the automated jobs and their scheduled run times via:
PARS main menu > System management > System > Automation > Job list

If any of your jobs are scheduled to run during busy times (typically 8am - 5pm) then consider rescheduling them to run at a quieter time. Jobs can be rescheduled by clicking the magnifying glass button next to their title. These changes will not take effect until the following school day, so you will need to wait until then to observe the performance of PARS.
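
If you copy the job titles and run times out of the Job list by hand, a sketch like the following could pick out the jobs worth rescheduling. The function names and the example job titles are hypothetical; the busy period of 8am to 5pm is the typical one mentioned above, and treating 5pm itself as outside the busy period is an assumption.

```python
from datetime import time

# Typical busy period from this page; adjust to suit your school day.
BUSY_START = time(8, 0)
BUSY_END = time(17, 0)


def runs_in_busy_period(run_time, start=BUSY_START, end=BUSY_END):
    """Return True if run_time falls within the busy period (end excluded)."""
    return start <= run_time < end


def jobs_to_reschedule(jobs):
    """Return the titles of jobs scheduled during the busy period, sorted by title.

    jobs maps a job title to its scheduled run time as a datetime.time.
    """
    return sorted(
        title for title, run_time in jobs.items() if runs_in_busy_period(run_time)
    )


# Example with made-up job titles.
example_jobs = {
    "Nightly cache build": time(2, 30),
    "Attendance letters": time(10, 15),
}
print(jobs_to_reschedule(example_jobs))
```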

Automation should also build data caches so that end users do not have to. To test that automation is building the caches go to:
PARS main menu > System management > System > Cache viewer

Each of the PARS caches is listed on this page. Some of the caches should have a yellow background, indicating they were built by the automation module. If none of the caches have yellow backgrounds, check that automation is running by going to:
PARS main menu > System management > System > Automation > Status

If automation is running but not building data caches, run a PARS Service Pack.

Blockers

Sometimes poorly written or recursive SIMS User Defined Reports, amongst other things, will require access to SIMS datatables for a long time, which denies access to those datatables for other users. Other users are then forced to 'queue' until the datatable is released. The Blockers tab of the performance monitor page will show whether this is happening:

PARS main menu > System management > System > Performance monitor

Note that some unobtrusive blocking is normal, so use your judgement as to how frequently and for how long blocking occurs, from which workstations and products, and whether it is a root cause or a symptom.

PARS Tweaks

Some tweaks in PARS itself may improve performance, but these are likely to affect only certain areas of the system, such as saving behaviour incidents. The following report will indicate the settings in PARS that may affect performance:
PARS main menu > System management > Management reports > Performance

SQL Server

This is not a definitive list of potential issues on the SQL server, but many of the issues listed below have been observed as the cause of performance issues in live school environments.

  • Running antivirus on the SIMS MDF and LDF data files
  • Huge LDF files from not backing up and shrinking them
  • Power Management settings on the VM Host powering down the Host CPUs to x% to save power
  • Limiting VM CPU to tiny percentages of the host
  • Adding more cores to VM Guests than the Host actually comprises
  • Limiting the SQL server to run on a single CPU with Processor Affinities
  • Using the default Microsoft NIC driver on a VM rather than the correct manufacturer's driver
  • Having SQL and IIS on the same physical server
  • Having the SIMS MDF and LDF files on the OS volume
  • Having the SIMS MDF and LDF files on a fast SSD, but leaving the frequently used TempDB on a slow mechanical drive
  • Not performing routine housekeeping such as archiving SIMS attendance data or scheduled Database re-indexes
  • Third party tools running constantly against the SIMS database or running during peak times
  • Having RAID set to WRITE THROUGH and not WRITE BACK

The following configurations may positively affect the performance of the SQL server:

  • Host the SIMS ldf and mdf files on separate physical drives (not on separate partitions of the same physical drive)
  • Use RAID 10 to maintain the SIMS ldf file over two separate drives. This requires specialist hardware and may be costly