About

What is QNOD?

QNOD stands for QualityNet Operations Dashboard. QNOD is a single pane of glass into the health of the entire CCSQ ecosystem. Capabilities include predictive analytics and historical performance. It is the preferred method to share the state of your service with CMS and the rest of the community.

How does it work

QNOD integrates data from multiple reporting sources, then displays system health and availability information about each CCSQ service based on this data. The data included for each service are the KPIs that the service owner determined were important for understanding whether the service is running as expected. Service owners can get the most out of QNOD by ensuring that the data they provide is what matters most for understanding how their service is performing.

QNOD currently pulls in data from:

  • New Relic
  • Splunk
  • AWS CloudWatch

QNOD takes in values from multiple monitoring tools and applications and brings them into one unified view, making it easy to understand how a service is performing. This also means that we can add new monitoring tools to QNOD in the future without losing the historical view.
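The idea of a unified view can be sketched in a few lines of Python. This is only an illustration, not QNOD's actual ingestion code: the `KpiReading` record, the `normalize` function, and every payload field name below are hypothetical and were chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiReading:
    """One KPI value in a shared, tool-independent shape (hypothetical)."""
    service: str
    kpi: str
    value: float
    source: str


# Hypothetical field names per monitoring tool; real payloads will differ.
FIELD_MAP = {
    "new_relic": ("app", "metric", "result"),
    "splunk": ("svc", "kpi_name", "val"),
    "cloudwatch": ("namespace", "metric_name", "average"),
}


def normalize(source: str, raw: dict) -> KpiReading:
    """Map a source-specific payload onto the shared KpiReading shape."""
    service_key, kpi_key, value_key = FIELD_MAP[source]
    return KpiReading(raw[service_key], raw[kpi_key], float(raw[value_key]), source)


reading = normalize("splunk", {"svc": "HARP", "kpi_name": "CPU Used Percent", "val": "42.5"})
print(reading)
```

Because everything downstream only sees `KpiReading`, supporting another tool in this sketch means adding one entry to `FIELD_MAP`, and previously stored readings stay comparable.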



Log in at https://qnetdashboard.cms.gov/


Getting Started



Step 1: If you do not yet have a HARP account, please register for a HARP ID. For instructions on the HARP registration process, refer to the HARP page.

Step 2: Once the HARP account has been created, log into HARP and request a "Service" entitlement via a HARP User Role. 

    • Select User Roles from the top of the page, and select Request a Role.
    • Select QualityNet Operations Dashboard.
    • Select your Organization.
    • Select the following user role: 
      • Viewer

Step 3: The organization's Security Official reviews and approves/denies the user role request. You will be notified via email that your request has been submitted, and again when your role has been approved or denied.

Step 4: Log into QualityNet Operations Dashboard using your HARP credentials.


Resources


Environments:


Consumer Onboarding Info

Developer Info


What is machine learning and how can it help you?

Machine learning is a type of artificial intelligence that uses computer algorithms to automatically discover patterns in data. QNOD’s machine learning algorithms learn how CCSQ services behave from the historical data provided by the service owners, so the quality of the data provided to QNOD is essential for getting quality machine learning results. What the models learn can be used to make predictions about how services will perform in the future.

QNOD currently includes two types of machine learning models:

  • Anomaly detection
  • Predictive analytics

Anomaly detection

An anomaly does not mean that something is wrong with a service. An anomaly simply means that our model observed values outside of the typical range. A typical value range for a service comes from the historical values of the KPIs that the service owner shared with the QNOD team.
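A minimal sketch of the "typical range" idea, assuming a simple statistical band (mean plus or minus a few standard deviations) in place of QNOD's actual models, which are not described here. The function names and the default `k` are hypothetical.

```python
from statistics import mean, stdev


def typical_range(history, k=3.0):
    """Return (low, high) bounds derived from historical KPI values.

    Assumption: "typical" means within k sample standard deviations of
    the historical mean. This is a stand-in, not QNOD's real model.
    """
    mu = mean(history)
    sigma = stdev(history)  # requires at least two historical values
    return mu - k * sigma, mu + k * sigma


def is_anomaly(value, history, k=3.0):
    """Flag a value that falls outside the typical range."""
    low, high = typical_range(history, k)
    return value < low or value > high


cpu_history = [50, 52, 49, 51, 50, 48, 52]  # made-up CPU Used Percent values
print(is_anomaly(51, cpu_history))  # inside the band
print(is_anomaly(90, cpu_history))  # well outside the band
```

As in the text above, a flagged value only says that the observation is unusual relative to history; it does not by itself say that the service is broken.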

Predictive analytics

Our machine learning models review data for a service over a period of time. The model uses this data to determine what would be an expected value for a KPI based on what values were seen in the past. QNOD displays what the expected values will be in the future.
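To make "an expected value based on past values" concrete, here is a deliberately simple sketch that uses a moving average as the predictor. This is an assumption made for illustration; it is not the technique QNOD's models use, and `expected_next` and its `window` parameter are hypothetical names.

```python
def expected_next(history, window=3):
    """Estimate the next KPI value as the average of the most recent values.

    A moving average stands in for a trained model here (assumption).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not history:
        raise ValueError("history must not be empty")
    recent = history[-window:]
    return sum(recent) / len(recent)


queue_depth = [10, 20, 30, 40]  # made-up historical values
print(expected_next(queue_depth, window=2))
```

A real predictor would also account for trends and daily or weekly cycles, which is one reason longer and cleaner history tends to matter.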

QNOD’s machine learning engineers are also investigating other types of machine learning including Deep Reinforcement Learning and Natural Language Processing (NLP) models.

Interested in implementing machine learning on your services? Contact the ML team.



FAQs



General


How can you make sure your service’s information is accurate?

Keep the QNOD team informed whenever you change your monitoring. When a data feed changes, a service may display as down until the feed details have been updated in QNOD. This can happen with:

  • Host name changes
  • AWS tag and environment updates
  • New monitoring tools added
  • Any other changes to your data feeds


What is Grafana?

Grafana is a multi-platform open source analytics and interactive visualization web application. It provides charts, graphs, and alerts for the web when connected to supported data sources.  Please visit https://grafana.com/ for more information. 


Don't see expected dashboard(s)?

Please contact us at #help-qnod-dashboard




Access


How do I register for a HARP account?

For instructions on the HARP registration process, refer to the HARP page.


How do I request access to QualityNet Operations Dashboard?

Users must register for a HARP ID. Once the HARP account has been created, log into HARP and request the QualityNet Operations Dashboard role (see Step 2 under Getting Started).


How do I log into QualityNet Operations Dashboard?

Log in at https://idm.cms.gov/ using your HARP credentials. Select QualityNet Operations Dashboard after logging in.




Plugins


What are Plugins?

Grafana plugins come in three types: Panel (visualizations), Data source (communication with external sources of data), and App (application monitoring). You can also choose to build your own plugin. For more information, please visit: https://grafana.com/docs/grafana/latest/plugins/


What Grafana plugins are currently installed?

  • Alert List 
  • Azure Monitor
  • Bar gauge
  • Blendstat
  • CloudWatch
  • D3 Gauge
  • Dashboard list
  • Diagram
  • Discrete
  • Dynamic text
  • Elasticsearch
  • FlowCharting
  • Gauge
  • Google Cloud Monitoring
  • Graph
  • Graphite
  • Heatmap
  • InfluxDB
  • Jaeger
  • Logs
  • Loki
  • Microsoft SQL Server
  • MySQL
  • New Relic
  • News
  • Node Graph
  • OpenTSDB
  • Pie chart v2
  • Polystat
  • PostgreSQL
  • Prometheus
  • Singlestat
  • Stat
  • Status Panel
  • Table
  • Table (old)
  • Tempo
  • TestData DB
  • Text
  • Time series
  • Zipkin
  • ePict Panel


Where can I find a list of available Grafana plugins?

For a list of available plugins, please visit https://grafana.com/grafana/plugins/


How do I request a new plugin?

Please contact us at #help-qnod-dashboard





Release Notes

April 12, 2023

QualityNet Operations Dashboard v1.23.1.6

Affected customers: All QualityNet users.

On Wednesday, April 12, 2023, at 8 p.m. ET, we will be releasing a new version of the QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.

Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.

What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.

New features

  • Machine Learning Enablement – Uptime Prediction
    • QTSO
      • This AI/ML capability uses deep learning models to give service owners a 10-minute look-ahead at potential issues at the KPI level, with an 86% confidence level in the prediction. This gives service owners an opportunity to investigate their service and look for potential issues.
      • Predictions are available for the following KPIs: RDS Freeable Disk, RDS Freeable Memory.

Service Adjustments and Baseline Reconciliation Improvements

  • FAS
    • Removed non-impacting KPIs and updated component and KPI weights to more accurately represent service health.
  • New Relic
    • Updated KPI and component weights to more accurately represent service health.
  • McAfee Web Gateway
    • Added additional Synthetic Monitor and removed non-impacting KPIs to more accurately represent service health.
  • EQRS Portal
    • Added additional KPIs and removed non-impacting KPIs to more accurately represent service health.
  • QTSO
    • Removed non-impacting KPIs and updated component and KPI weights to more accurately represent service health.
  • iQIES & QIES
    • Moved monitoring of MDS application infrastructure from QIES to iQIES in preparation for migration cutover on April 17.
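
These notes do not say how component and KPI weights are combined into a service health figure. As a rough illustration only, one common approach is a weighted average, sketched below; the `service_health` function, the 0-to-1 score scale, and the sample weights are all assumptions.

```python
def service_health(kpis):
    """Combine per-KPI scores into one health score using weights.

    kpis: list of (score, weight) pairs, with score between 0.0 (failing)
    and 1.0 (healthy). The scale and the formula are assumptions.
    """
    total_weight = sum(weight for _, weight in kpis)
    if total_weight == 0:
        raise ValueError("at least one KPI needs a non-zero weight")
    return sum(score * weight for score, weight in kpis) / total_weight


# A heavily weighted healthy KPI outweighs a lightly weighted failing one.
print(service_health([(1.0, 3), (0.0, 1)]))
```

Under this kind of scheme, lowering the weight of a non-impacting KPI (or removing it) keeps a failure there from dragging down the overall score.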

Resolved issues

  • Machine Learning Models – Shifting trends in service metric data may adversely affect the performance of both Anomaly Detection (AD) and Uptime Prediction (UP) models. Retraining is complete for the listed models and AD alerts have been re-enabled where applicable.
    • ClamAV AD
    • Confluence AD
    • QTSO AD
    • ServiceNow AD

Known issues

  • CCSQ QuickSight – Occasional insufficient data false positives for User Experience KPIs
  • EQRS SF – No data is available for User Experience KPIs (“canaries”) while the Portal team troubleshoots API endpoints. Service health alerts have been disabled for EQRS SF.
  • QNOD – Time picker for service drill-down dashboards has been temporarily disabled to address possible issues with system stability.


March 29, 2023

QualityNet Operations Dashboard v1.23.1.5

Affected customers: All QualityNet users.

On Wednesday, March 29, 2023, at 8 p.m. ET, we will be releasing a new version of the QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.

Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.

What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.

New features

  • Machine Learning Enablement – Anomaly Detection
    • QTSO
      • This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service.
      • This capability will aid service owners in investigating root causes as well as fixing issues with their service that would otherwise lead to a potential future service issue or degradation.

Service Adjustments and Baseline Reconciliation Improvements

  • HQR
    • Removed non-impacting KPIs and updated component and KPI weights to represent their contribution to service health more accurately.
  • QDIVS DARRT
    • Updated component weights to represent service health more accurately.
  • DEL
    • Removed non-impacting KPIs and updated KPI weights to represent their contribution to service health more accurately.
  • F5
    • Added KPIs that were considered critical to overall service health.
  • Presentation Zone
    • Decommissioned this service as it was a duplicate of the F5 service.

Resolved issues

  • iQIES – Improved New Relic queries to avoid occasional timeouts when fetching metrics.
  • Machine Learning Models – Shifting trends in service metric data may adversely affect the performance of both Anomaly Detection (AD) and Uptime Prediction (UP) models. Retraining is complete for the listed models and AD alerts have been re-enabled where applicable.
    • Confluence AD
    • Jira AD
    • ServiceNow AD
    • Syslog AD

Known issues

  • ClamAV Anomaly Detection (AD) Machine Learning Model – Shifting trends in service metric data may adversely affect the performance of both Anomaly Detection (AD) and Uptime Prediction (UP) models. AD alerts have been disabled while this model is monitored and retrained.
  • CCSQ QuickSight – Occasional insufficient data false positives for User Experience KPIs
  • EQRS SF – No data is available for User Experience KPIs (“canaries”) while the Portal team troubleshoots API endpoints. Service health alerts have been disabled for EQRS SF.
  • QNOD – Time picker for service drill-down dashboards has been temporarily disabled to address possible issues with system stability.


March 15, 2023

QualityNet Operations Dashboard v1.23.1.4

Affected customers: All QualityNet users

On Wednesday, March 15, 2023, at 8 p.m. ET, we will be releasing a new version of the QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.

Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.

What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.

New features:

  • New Service Onboarding
    • VPN ASA
      • Onboarded initial KPIs: CPU Used Percent and Memory Used Percent.

 

  • Machine Learning Enablement – Anomaly Detection (Released March 13)
    • New Relic
      • This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service.
      • This capability will aid service owners in investigating root causes as well as fixing issues with their service that would otherwise lead to a potential future service issue or degradation.
    • SAS Viya
      • This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service.
      • This capability will aid service owners in investigating root causes as well as fixing issues with their service that would otherwise lead to a potential future service issue or degradation.


  • Machine Learning Enablement – Uptime Prediction (Released March 13)
    • New Relic
      • This AI/ML capability uses deep learning models to give service owners a 10-minute look-ahead at potential issues at the KPI level, with an 86% confidence level in the prediction. This gives service owners an opportunity to investigate their service and look for potential issues.
      • Predictions are available for the following KPIs: Queue Depth.

Service Adjustments and Baseline Reconciliation Improvements:

  • Barracuda
    • Updated component and KPI weights to more accurately represent their contribution to overall service health.
    • Added ELB and Email Gateways to those already being monitored.
    • Added failing state thresholds to In and Out Queue KPIs.
  • Confluence
    • Removed non-impacting KPIs and updated component and KPI weights to more accurately represent overall service health.
  • iQIES
    • Removed non-impacting KPIs to more accurately represent overall service health.
  • QMARS Fax
    • Removed sunsetting Reverse Proxy subsystem.
  • Nexus
    • Removed sunsetting IQ Auditor and IQ Firewall subsystems.
  • Zscaler
    • Removed non-impacting KPIs and updated component and KPI weights to more accurately represent overall service health.

Resolved issues:

  • Adjusted New Relic query intervals to avoid gaps in EC2 Status Check data for some Active Directory hosts.
  • Resolved an issue where a missing entity in HARP data interfered with Anomaly Detection for that service.
  • Resolved minor issues with panel labels and rendering on the QIES drill-down dashboard.

 Known issues:

  • CCSQ QuickSight – Occasional insufficient data false positives for User Experience KPIs
  • EQRS SF – No data is available for User Experience KPIs (“canaries”) while the Portal team troubleshoots API endpoints. Service health alerts have been disabled for EQRS SF.
  • Time picker for service drill-down dashboards has been temporarily disabled to address possible issues with system stability


March 2, 2023

QualityNet Operations Dashboard v1.23.1.3

Affected customers: All QualityNet users

On Thursday, March 2, 2023, at 8 p.m. ET, we will be releasing a new version of the QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.

Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.

What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.

New features:

  • Machine Learning Enablement – Uptime Prediction
    • HARP
      • This AI/ML capability uses deep learning models to give service owners a 30-minute look-ahead at potential issues at the KPI level, with an 86% confidence level in the prediction. This gives service owners an opportunity to investigate their service and look for potential issues. Where possible, predictions are made at the entity level (e.g. per host) to improve accuracy and diagnostic utility.
      • Predictions are available for the following KPIs: Hera APM Throughput, Login APM Throughput, SO Tool APM Throughput, Account Recovery CPU Used Percent, Account Recovery Memory Used Bytes, ADO API Memory Used Bytes, HERMES CPU Used Percent, HERMES Memory Used Bytes, HOMER CPU Used Percent, HOMER Memory Used Bytes, Registration CPU Used Percent, Registration Memory Used Bytes, SO Tool Memory Used Bytes, Utility Memory Used Bytes.
    • New Relic
      • This AI/ML capability uses deep learning models to give service owners a 10-minute look-ahead at potential issues at the KPI level, with an 86% confidence level in the prediction. This gives service owners an opportunity to investigate their service and look for potential issues.
      • Predictions are available for the following KPIs: New Relic Status.

 Baseline Reconciliation Improvements:

  • Ansible Tower
    • Added an additional synthetic monitor, "AT-PROD-URL."
  • ClamAV
    • Updated component and KPI weights to more accurately represent their contribution to overall service health.
    • Improved KPI legend specificity.
  • HARP
    • Improved resiliency of synthetics queries by using tags instead of synthetic names.
    • Updated KPI weights to more accurately represent their contribution to overall service health.
  • HIDS HARP Automation
    • Updated component and KPI weights to more accurately represent their contribution to overall service health.
  • Office365
    • Renamed "API Status" to "O365 Status API."
  • QIES
    • Removed MDS APM Error Rate KPI as it does not impact overall service health.

 Known issues:

  • CCSQ QuickSight – Occasional insufficient data false positives for User Experience KPIs
  • EQRS SF – No data is available for User Experience KPIs (“canaries”) while the Portal team troubleshoots API endpoints. Service health alerts have been disabled for EQRS SF.
  • Time picker for service drill-down dashboards has been temporarily disabled to address possible issues with system stability


February 15, 2023

QualityNet Operations Dashboard v1.23.1.2

Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.


What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.


New features:

  • Machine Learning Enablement – Anomaly Detection
    • HARP
      • This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service.
      • This capability will aid service owners in investigating root causes as well as fixing issues with their service that would otherwise lead to a potential future service issue or degradation.


Resolved issues:

  • Updated New Relic account information for the DELWeb service.
  • Updated PRS service Compute KPIs to reflect infrastructure migration to AWS Fargate.


Known issues:

  • CCSQ QuickSight – Occasional insufficient data false positives for User Experience KPIs
  • EQRS SF – No data is available for User Experience KPIs (“canaries”) while the Portal team troubleshoots API endpoints. Service health alerts have been disabled for EQRS SF.
  • Time picker for service drill-down dashboards has been temporarily disabled to address possible issues with system stability


January 9, 2023

QualityNet Operations Dashboard v6.8

Affected customers: All QualityNet users

On Monday, January 9, 2023, at 8 p.m. ET, we will be releasing a new version of the QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.

Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.

What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.

New features:

  • Machine Learning Enablement – Anomaly Detection
    • Ansible Tower
      • This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service.
      • This capability will aid service owners in investigating root causes as well as fixing issues with their service that would otherwise lead to a potential future service issue or degradation.

Resolved issues:

  • Resolved an issue where selecting a specific KPI for viewing or editing opened the incorrect KPI

 Known issues:

  • CCSQ QuickSight – Occasional insufficient data false positives for User Experience KPIs
  • EQRS SF – No data is available for User Experience KPIs (“canaries”) while the Portal team troubleshoots API endpoints. Service health alerts have been disabled for EQRS SF.
  • Time picker for service drill-down dashboards has been temporarily disabled to address possible issues with system stability


December 21, 2022

QualityNet Operations Dashboard v6.7

Affected customers: All QualityNet users

On Wednesday, December 21, 2022, at 8 p.m. ET, we will be releasing a new version of the QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.

Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.

What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.

New features:

  • Machine Learning Enablement – Anomaly Detection
    • Jira
      • This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service.
      • This capability will aid service owners in investigating root causes as well as fixing issues with their service that would otherwise lead to a potential future service issue or degradation.
  • Machine Learning Enablement – Uptime Prediction
    • Jira
      • This AI/ML capability uses deep learning models to give service owners a 5-minute look-ahead at potential issues at the KPI level, with an 86% confidence level in the prediction. This gives service owners an opportunity to investigate their service and look for potential issues.
      • Predictions enabled for the following KPIs: APM Heap Used Percent, APM Throughput, Disk Free Percent, EFS Data Read IO, EFS Percent IO, Memory Used Percent, RDS CPU Used Percent, RDS Connections Count, RDS Freeable Memory, RDS Free Storage Space, RDS Read Latency, RDS Write Latency, Request Count, Synthetic Availability, Synthetic First Byte, Synthetic First Contentful Paint, Synthetic First Paint, Synthetic Latency, Synthetic On Page Load.

Resolved issues:

  • Airflow
    • Restored Process Count KPI
    • Transitioned CPU and Memory Used Percent KPIs from EC2 to ECS source
    • Removed EC2 Status Check and Disk Free Percent KPIs

Known issues:

  • CCSQ QuickSight – Occasional insufficient data false positives for User Experience KPIs
  • EQRS SF – No data is available for User Experience KPIs (“canaries”) while the Portal team troubleshoots API endpoints. Service health alerts have been disabled for EQRS SF.
  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability


December 7, 2022

QualityNet Operations Dashboard v6.6

Affected Customers: All QualityNet users

 

On Wednesday, December 7, 2022, at 8 p.m. ET, we will be releasing a new version of the QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.

 

Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.

 

What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.

 

New Features:

  • Machine Learning Enablement - Uptime Prediction
    • ClamAV
      • Added Uptime Predictions to Disk Free Percent and Memory Used Percent KPIs
      • This AI/ML capability uses deep learning models to give service owners a 5-minute look-ahead at potential issues at the KPI level for the ClamAV service, with an 86% confidence level in the prediction. This gives service owners an opportunity to investigate their service and look for potential issues.

Issues Resolved:

  • Barracuda
    • Aligned QNOD queries with New Relic (one hour) polling interval to mitigate intermittent insufficient data periods for various KPIs
  • QDIVS (FIVS)
    • Restored Network KPIs with insufficient data after QDIVS network infrastructure changes.

 

Known Issue(s):

  • Airflow – No data is available for some KPIs due to an issue with the New Relic data source
  • Barracuda – Insufficient data overnight for Network KPIs
  • CCSQ QuickSight – Occasional insufficient data false positives for User Experience KPIs
  • EQRS SF – No data is available for User Experience KPIs (“canaries”) while the Portal team troubleshoots API endpoints. Service health alerts have been disabled for EQRS SF.
  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability


November 28, 2022

QualityNet Operations Dashboard v6.5

Affected Customers: QNOD CMS Executive Users


On Monday, November 28, 2022, at 8 p.m. ET, we will be releasing a new version of our QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.


Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.


What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.


Issues Resolved:

  • Barracuda
    • Updated QNOD data ingestion after device migration
    • Added additional gateways to QNOD
  • Office365 
    • Updated QNOD data ingestion after device migration
    • Removed Office365 Process Count KPI
  • iQIES
    • Updated Web Transaction Time, SQS Oldest Message, and Visible Message
    • Corrected Redis Infrastructure unit type from “milliseconds” to “bytes”
    • Added ‘displayName’ facet to show all of Redis infrastructure
    • Updated the Redshift CPU legend to show hostnames


Known Issue(s):

  • Airflow - no data is available for some KPIs due to a data source issue unrelated to QNOD
  • Barracuda - intermittent insufficient data periods for Network KPIs
  • CCSQ QuickSight - occasional insufficient data false positives for User Experience KPIs
  • EQRS SF - no data is available for User Experience KPIs ("canaries") while the Portal team troubleshoots API endpoints
  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability


November 9, 2022

QualityNet Operations Dashboard v6.4

Affected Customers: QNOD CMS Executive Users


On Wednesday, November 9, 2022, at 8 p.m. ET, we will be releasing a new version of our QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.


Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.


What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.


New Feature(s):

  • Onboarding for Service(s):
    • QPP
      • Auth API
        • Response Time
        • Error Percentage
        • ECS CPU Utilization
        • ECS Memory Utilization
        • ECS Task Running Count
        • RDS CPU Utilization
        • RDS Memory Utilization
        • Synthetic Latency
        • Synthetic Result
      • Content Management
        • 4XX/5XX Error Count
        • Request Count
        • ECS CPU Utilization
        • ECS Memory Utilization
    • HQR 
      • Analytics Cloud Reporting - full subsystem decomposition
        • Application
        • Compute
        • Network
        • User Experience

Issues Resolved:

  • PRS - Resolved insufficient data reported for some KPIs
  • SAS Viya - Removed KPIs that are no longer relevant to service health
    • Web Transactions Time
    • APM Error Rate
    • APM Throughput
    • PG Database Connections
  • EQRS Portal - Resolved insufficient data reported for User Experience KPIs
  • EQRS Scoring and Feedback - Resolved degraded service health after New Relic account changes
  • QSEP - Resolved insufficient data reported for some KPIs
  • HQR - Removed decommissioned services from some KPIs
  • Corrected typographical errors in some KPI names
  • Switched FireEye_vnx Network Response time metrics provider from Manage Engine to New Relic
    • Removed Availability Network component
    • Separated Response Time metric into two components


Known Issue(s):

  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability.


October 27, 2022

QualityNet Operations Dashboard v6.3

On Thursday, October 27, 2022, at 8 p.m. ET, we will be releasing a new version of our QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.


Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.


What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.


New Feature(s):

  • Onboarding for Service(s):
    • QPP
      • Clinicians Insights API
        • Response Time
        • Error Percentage
        • ECS CPU Utilization
        • ECS Memory Utilization
        • ECS Task Running Count
        • RDS CPU Utilization
        • RDS Read Latency
        • Synthetic Latency
        • Synthetic Result
      • Scoring API
        • Response Time
        • Error Percentage
        • ECS CPU Utilization
        • ECS Memory Utilization
        • ECS Task Running Count
        • RDS CPU Utilization
        • RDS Read Latency
        • Synthetic Latency
        • Synthetic Result
      • Front End
        • 4XX/5XX Error Count
        • Healthy Host Count
        • Request Count
        • ECS CPU Utilization
        • ECS Memory Utilization
        • ECS Task Running Count
        • Error Percentage
        • Response Time
        • Web Errors
        • Web Throughput
        • Web Transaction Time
      • Content Management
        • RDS – CPU Utilization
        • RDS – Free Storage Space
    • MedTrak
      • FSx Metrics
      • Removed Subsystems


  • Machine Learning Enablement – Uptime Prediction for the following service(s):
    • Barracuda
      • Corrected the QNOD visualization to reflect 5-minute predictions. This was changed in release 6.1, when the Barracuda model had 3 additional KPIs added and the prediction time was changed from 30m to 5m to improve KPI predictive accuracy.


Known Issue(s):

  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability.


Tabs Page
titleOctober 12, 2022

QualityNet Operations Dashboard v6.2

On Wednesday, October 12, 2022, at 8 p.m. ET, we will be releasing a new version of our QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.


Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.


What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.


New Feature(s):

  • Onboarding for Service(s):
    • QPP
      • Subsystems – Onboarded these subsystem components: Eligibility, Submissions API, Targeted Review & Self Nomination.
      • User Experience – Onboarded KPIs: Synthetic Availability, Synthetic Latency.
      • Compute – Onboarded KPIs:
        • ECS – CPU Utilization, Memory Utilization, Task Running Count.
        • RDS – BinLogDiskUsage, CPUUtilization, DatabaseConnections, DiskQueueDepth, FreeStorageSpace, ReadIOPS, ReadLatency, ReadThroughput, SwapUsage, WriteIOPS, WriteLatency.
      • Applications – Onboarded KPIs: Errors, Web Transaction Time, Web Throughput, HealthCheckStatus.
      • Network – Onboarded KPIs: Healthy Host Count, Request Count, HTTPCode_Target_4XX_Count, HTTPCode_Target_5XX_Count.
    • FAS
      • Network – Onboarded KPIs: Request Count, Active Connection Count, Processed Bytes, Response Time.


  • Machine Learning Enablement – Anomaly Detection for the following service(s):
    • ClamAV
      • This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service.
      • This capability will aid service owners with investigating root causes as well as fixing issues with their service that would otherwise lead to a potential future service issue or degradation.


  • Machine Learning Enablement – Uptime Prediction for the following service(s):
    • ServiceNow
      • Adding CPU Used Percent and Memory Used Percent.
      • This AI/ML capability with deep learning models provides the service owner with a look ahead of 30 minutes into the future for any potential issues at the KPI level for the Confluence service, with an 86% confidence level in this prediction. This gives service owners an opportunity to investigate their service and look for potential issues.


Known Issue(s):

  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability.


Tabs Page
titleSeptember 28, 2022

QualityNet Operations Dashboard v6.1

On Wednesday, September 28, 2022, at 8 p.m. ET, we will be releasing a new version of our QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.


Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.


What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.


New Feature(s):

  • Onboarding for Service(s):
    • EQRS Portal
      • Application – Onboarded KPIs: Redshift Database Connections, Redshift Read/Write Latency.
      • Compute – Onboarded KPIs: Redshift Health Status, CPU Used Percent, Disk Used Percent, Read/Write Throughput.
      • Network – ALB Request Count & Active Connections, API Gateway 5xx Errors.
    • QNP
      • Compute – Replaced EC2 KPIs with Fargate KPIs.
    • Mailman
      • Compute – Replaced EC2 KPIs with Fargate KPIs.
    • QPP
      • Application – Added Synthetic Maker synthetics: Fail Percent.
    • FAS
      • Compute – Onboarded KPIs: CPU, Memory, Disk for the service’s Infrastructure.
  • Machine Learning Enablement – Anomaly Detection for the following service(s):
    • ServiceNow and Splunk:
      • This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service.
      • This capability will aid service owners with investigating root causes as well as fixing issues with their service that would otherwise lead to a potential future service issue or degradation.
  • Machine Learning Enablement – Uptime Prediction additional KPIs for the following service(s): 
    • Barracuda
      • Adding CPU Used Percent, Volume Queue Length & Request Count to the existing KPIs of In Queue Count and Out Queue Count
      • This AI/ML capability with deep learning models provides the service owner with a look ahead of 30 minutes into the future for any potential issues at the KPI level for the Confluence service, with an 86% confidence level in this prediction. This gives service owners an opportunity to investigate their service and look for potential issues.


Known Issue(s):

  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability.



Tabs Page
titleSeptember 7, 2022

QualityNet Operations Dashboard v5.5

On Wednesday, September 7, 2022, at 8 p.m. ET, we will be releasing a new version of our QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.


Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.


What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.


New Feature(s):

  • Monthly Service Availability Dashboard
    • A new dashboard that provides a single view of availability for all services over the past 30 days. The view also includes services that reported outages in the past 30 days.
  • Machine Learning Enablement – Anomaly Detection for the following service(s):
    • Jenkins:
      • This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service.
      • This capability will aid service owners with investigating root causes as well as fixing issues with their service that would otherwise lead to a potential future service issue or degradation.
  • Machine Learning Enablement – 30 Minute Uptime Prediction for the following service(s):
    • Ansible Tower:
      • This AI/ML capability with deep learning models provides the service owner with a look ahead of 30 minutes into the future for any potential issues at the KPI level for the Confluence service, with an 86% confidence level in this prediction. This gives service owners an opportunity to investigate their service and look for potential issues.


Known Issue(s):

  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability.


Tabs Page
titleAugust 26, 2022

QualityNet Operations Dashboard v5.4

On Friday, August 26, 2022, at 8 p.m. ET, we will be releasing a new version of our QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.


Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements as well as fixes to any issues that have been resolved.


What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.


New Feature(s):

  • Onboarding for Service(s):
    • EQRS Scoring and Feedback
      • Synthetics – Determines the availability of the QMARS application; Duration and Success Percent are the two KPIs ingested into QNOD.
      • Application – Onboarded KPIs: Web Transaction Time, Read/Write Latency, Transaction Throughput.
      • Compute – Onboarded KPIs: CPU, Memory, Disk, Read/Write IOPS for the service’s Infrastructure.
    • MAT
      • Application – Onboarded KPIs: Application Web Transaction Time, Error Rate, and Transaction Throughput; RDS Connections and Read/Write Latency.
      • Compute – Onboarded KPIs: EC2 CPU, Memory, and Disk; RDS CPU, Memory, Disk, and Read/Write IOPS; ECS CPU and Memory.
      • Network – Onboarded KPIs: ALB Requests and Active Connections.
    • QMARS
      • Synthetics – Added legacy synthetics, which determine the availability of the QMARS application; Duration and Success Percent are the two KPIs ingested into QNOD.
      • Network – Onboarded KPIs: ALB Active Connections, ALB Processed Bytes, and ALB Request Count.
    • QSEP
      • Application – Onboarded KPIs: Web Transaction Time, Read/Write Latency, Transaction Throughput, Processes Count, and Application Error Rate.
      • Compute – Onboarded KPIs: CPU, Memory, Disk, Read/Write IOPS for the service’s Infrastructure.
      • Network – ALB Request Count, ALB Avg. Healthy Host Count, ALB Response Time, Nginx Requests Per Second, Nginx Active Connections, Nginx Connections Waiting.
  • Machine Learning Enablement – 30 Minute Uptime Prediction for the following service(s):
    • Jenkins and Splunk:
      • This AI/ML capability with deep learning models provides the service owner with a look ahead of 30 minutes into the future for any potential issues at the KPI level for the Confluence service, with an 86% confidence level in this prediction. This gives service owners an opportunity to investigate their service and look for potential issues.


Performance Improvements:

  • Removed the additional components involved in data transfer between the data collection and data storage layers, as well as those between the data processing and data storage layers. This significantly improves performance and enables quicker restoration than the previous process. It also reduces downtime during deployments to 1–2 minutes, compared to the previous 10–15 minutes.
  • Multiple organizations within the InfluxDB data were merged into one for better visibility and testing within the data storage layer.


Known Issue(s):

  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability.


Tabs Page
titleAugust 10, 2022

QualityNet Operations Dashboard v5.3

On Wednesday, August 10, 2022, at 8 p.m. ET, we will be releasing a new version of our QualityNet Operations Dashboard (QNOD). This release will cause QNOD to be unavailable until 10 p.m. ET.


Why is this release happening?

New versions of QNOD are released each sprint to ensure that QNOD users have continued access to the latest product features and enhancements, as well as fixes to any issues that have been resolved.


What are some of the enhancements included in the upgrade?

Users can expect the following new functionality with this release as well as fixes and/or security patching.


New Feature(s):

  • Onboarding for Service(s):
    • QMARS
      • Synthetics – Determines the availability of the QMARS application; Duration and Success Percent are the two KPIs ingested into QNOD.
      • Application – Onboarded KPIs: Web Transaction Time, Read/Write Latency, Transaction Throughput.
      • Compute – Onboarded KPIs: CPU, Memory, Disk, Read/Write IOPS, for the service’s Infrastructure.
    • QIES
      • Split out KPIs into distinct applications for CASPER, MDS, and PBJ.
      • User Experience – Expanded synthetic tests to include the following KPIs: 'First Byte', 'First Paint', 'First Contentful Paint', and 'Page Load'.
      • Application – Update KPIs for APM Response Time, APM Throughput, APM Error Rate.
  • Machine Learning Enablement – Anomaly Detection for the following service(s):
    • Confluence and Syslog:
      • This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service.
      • This capability will aid service owners with investigating root causes as well as fixing issues with their service that would otherwise lead to a potential future service issue or degradation.
  • QNOD Notifications:
    • Confluence and Syslog:
      • Using the Slack channel #alerts-qnod-prod, the Anomaly Detection process from the Machine Learning capability above will send alerts when an anomalous event is detected in these services. An alert will also be sent when the condition has cleared for the service.
      • This capability will aid service owners with identifying and quickly fixing issues with their service that would otherwise lead to a potential service issue or degradation.
  • Service Availability:
    • Service Availability percentage for the past 24 hours is now displayed in service drilldown dashboards.
      • The Service Availability percentage is the percentage of time the service stayed in a “Non-Critical” state out of the total time reported to New Relic.
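
The availability definition above can be illustrated with a minimal sketch. It assumes the service state is sampled at a fixed interval and that each sample is a simple state label; the function name, the labels, and the sampling interval are hypothetical and do not reflect how QNOD stores or queries its data.

```python
def availability_percent(states):
    """Return the share of reported samples that were not 'Critical', as a percentage.

    `states` is a list of state labels, one per reporting interval (assumed
    to be evenly spaced). Intervals with no report are simply absent, so
    they do not count toward the total time reported.
    """
    if not states:
        return None  # nothing was reported, so availability is undefined
    non_critical = sum(1 for state in states if state != "Critical")
    return 100.0 * non_critical / len(states)


# 24 hours of one-minute samples containing a 36-minute critical period
samples = ["OK"] * 1404 + ["Critical"] * 36
print(availability_percent(samples))  # 97.5
```

Under these assumptions, a "Degraded" sample still counts as available, because only the "Critical" state is excluded.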


Issue(s) Resolved:

  • QIES
    • User Experience – corrected a failing synthetic test.


Known Issue(s):

  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability.


Tabs Page
titleJuly 27, 2022

QualityNet Operations Dashboard v5.2

New Feature(s):

  • Synthetics for the following service(s):
    • MedTrak
  • Full Decomposition for the following service(s):
    • QCOR
    • QDIVS
    • MedTrak
  • Machine Learning Enablement – 30-Minute Uptime Prediction Model for the following service(s):
    • Confluence – This AI/ML capability with deep learning models provides the service owner with a look ahead of 30 minutes into the future for any potential issues at the KPI level for the Confluence service, with an 86% confidence level in this prediction. This gives service owners an opportunity to investigate their service and look for potential issues.
  • Machine Learning Enablement – Anomaly Detection for the following service(s):
    • Barracuda – This AI/ML capability with deep learning models provides service owners with a 24-hour historical view of anomalies that may have occurred with their service. This capability will aid service owners with investigating the root cause and fixing any issues with their service that would otherwise lead to a potential future service issue or degradation. 

Issue(s) Resolved:

  • None this release

Known Issue(s):

  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability.


Tabs Page
titleJuly 21, 2022

QualityNet Operations Dashboard v5.1.2

New Feature(s):

  • Machine Learning Enablement – 30 Minute Uptime Prediction Model for the following service(s):
    • Barracuda
    • Syslog


Tabs Page
titleJuly 14, 2022

QualityNet Operations Dashboard v5.1.1

Roll Back New Feature(s):

  • Roll Back Machine Learning Enablement – 30 Minute Uptime Prediction Model for the following service(s):
    • Barracuda
    • Syslog


Tabs Page
titleJuly 13, 2022

QualityNet Operations Dashboard v5.1


New Feature(s):

  • Full Decomposition for the following service(s):
    • QIES
    • QTSO


  • Machine Learning Enablement – 30 Minute Uptime Prediction Model for the following service(s):
    • Barracuda
    • Syslog


Issue(s) Resolved:

  • The following issues with service drilldown dashboards are fixed:
    • AD – Process Count KPI panels updated to show process names along with host names.
    • Certificate Authority – Process Count KPI panels updated to show process names along with host names.
    • ClamAV – Process Count KPI panels updated to show process names along with host names.
    • PRS – Updated legend name for the KPI panels.
    • Hive – Average Healthy Host Count and Process Count KPI panels updated with correct legend names.
  • Drilldown dashboards for Office365, DELWeb, and McAfee WG services fixed to show host names in KPI panel labels.


Known Issue(s):

  • Time picker for service drilldown dashboards has been temporarily disabled to address possible issues with system stability.


Tabs Page
titleJune 22, 2022

QualityNet Operations Dashboard v4.6

New Feature(s):

  • Synthetic Monitoring for the following service(s):
    • MedTrak
  • Full Decomposition for the following service(s):
    • EQRS Portal Service
    • HARP/HIDS Automation (Additional KPIs)
  • WAN - New devices added to New Relic are now reported in QNOD


Issue(s) Resolved:

  • Resolved an issue that prevented the Alerter process from sending notifications for service state changes
  • Fixed the ‘Disk Free Percent’ KPI reported via New Relic for multiple services
  • Updated F5 URLs synthetic test scripts to accommodate the F5 network device’s move to new hardware


Tabs Page
titleJune 8, 2022

QualityNet Operations Dashboard v4.5

New Feature(s)

  • Synthetic Monitoring for the following service(s):
    • EQRS Scoring and Feedback
    • QIES
    • QTSO
  • Full Decomposition for the following service(s):
    • iQIES/PASCID
  • Additional reports have been added to the Grafana Metrics API:
    • The Recovery Rate report calculates the ratio of failed deployments to the total number of deployments, shown on a quarterly basis.
    • The Mean Time to Recover report shows, on a quarterly basis, the average amount of time it takes for an application to recover from a failed deployment.
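
As a rough illustration of the two reports described above, the sketch below computes both figures for one quarter of deployments. The record layout (`failed`, `recovery_time`) and the function names are assumptions made for this example; they are not the Grafana Metrics API's actual schema.

```python
from datetime import timedelta


def recovery_rate(deployments):
    """Ratio of failed deployments to the total number of deployments."""
    if not deployments:
        return None  # no deployments in the quarter
    failed = sum(1 for d in deployments if d["failed"])
    return failed / len(deployments)


def mean_time_to_recover(deployments):
    """Average recovery time across the failed deployments only."""
    durations = [d["recovery_time"] for d in deployments if d["failed"]]
    if not durations:
        return None  # nothing failed, so there is nothing to average
    return sum(durations, timedelta()) / len(durations)


# Hypothetical deployments for a single quarter
quarter = [
    {"failed": False, "recovery_time": None},
    {"failed": True, "recovery_time": timedelta(minutes=30)},
    {"failed": False, "recovery_time": None},
    {"failed": True, "recovery_time": timedelta(minutes=90)},
]
print(recovery_rate(quarter))         # 0.5
print(mean_time_to_recover(quarter))  # 1:00:00
```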


Tabs Page
titleJune 1, 2022

QualityNet Operations Dashboard v4.4

New Feature(s)

  • Synthetic Monitoring for the following services
    • FAS
    • QCOR
    • HARP/HIDS Automation
    • MFT
  • AWS RSS messages
    • Messages for the US-East-1 and global regions are continuously retrieved from AWS and are available in the #aws-rss-alerts Slack channel
  • New dashboards available
    • 24-hour Service Issues Summary
    • Service Issues Reports
  • New component added for New Relic service drilldown to capture New Relic minion (synthetic test monitors) health

Bug Fixes:

  • FireEye ETP
    • Updated Synthetic Availability so that it no longer reports a constant degraded state.
  • FireEye vNX
    • Updated Interface KPIs so that they no longer report a constant degraded state
  • McAfee GW
    • Updated Interface KPIs so that they no longer report a constant degraded state
  • Certificate Authority
    • Both Disks are now reporting correctly, and the overall service health is more accurate
  • Nexus
    • Fixed thresholds for Disk, which previously reported “Insufficient Data” even when data was available
  • Slack


Tabs Page
titleMay 11, 2022

QualityNet Operations Dashboard v4.3

New Feature(s)

  • Synthetic Monitoring for the following services:
    • QDIVS
    • Bonnie/MAT
    • QSEP/ITSP
    • CCSQ QuickSight
  • Full Decomposition for the following services:
    • DEL
  • Anomaly detection displayed for the following services:
    • Confluence
  • Improved reporting
    • AWS evaluation now includes reports from the AWS global and us-east-1 public health API
  • Landing page Enhancements
    • Service panels now include a hover function to display current service health %.
    • New service status icon to identify services which have KPI issues yet do not fully affect a service’s state.
  • Metric Sources Dashboard
    • A dashboard displaying a summary of metric sources (the metric summaries dashboard) is now available and is linked from the landing page.
    • A dashboard drilling into each metric source, detailing degraded, failed, and missing values is available from the metric summaries dashboard.

Bug Fixes:

  • KPIs that appeared blank on the service decomposition diagram of a service drilldown now appear as No Data (grey).
  • Missing KPIs for services now contribute to the overall state of a service.
  • Most services were not being evaluated against the ‘minutes_threshold’ value in their service decomposition definitions, so a single data point could change the overall state of a service. Service KPIs are now evaluated correctly.
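
To show why a ‘minutes_threshold’ check keeps a single data point from changing a service's state, here is a minimal sketch. It assumes one KPI reading per minute and treats the threshold as the number of consecutive breaching minutes required; the function name, the `limit` parameter, and this interpretation of ‘minutes_threshold’ are assumptions, not QNOD's actual evaluation logic.

```python
def kpi_breached(values, limit, minutes_threshold):
    """Decide whether a KPI counts as breached.

    `values` holds one reading per minute, oldest first. The KPI is
    breached only when every reading in the most recent
    `minutes_threshold` minutes exceeds `limit`.
    """
    if minutes_threshold <= 0:
        # No waiting period: the latest reading alone decides.
        return bool(values) and values[-1] > limit
    recent = values[-minutes_threshold:]
    # Require a full window so that short histories do not trigger.
    return len(recent) == minutes_threshold and all(v > limit for v in recent)


print(kpi_breached([10, 95, 12], limit=90, minutes_threshold=3))      # False: one spike
print(kpi_breached([50, 95, 96, 97], limit=90, minutes_threshold=3))  # True: sustained
```

With a threshold of 0, the same isolated spike would flip the state as soon as it became the latest reading, which is the behavior the fix above removes.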


Tabs Page
titleApril 27, 2022

QualityNet Operations Dashboard v4.2

New Feature(s)

  • Full Decomposition for the following services:
    • Ambari Infrastructure
    • TestRail
    • SAS Viya
    • Zeppelin


Tabs Page
titleApril 13, 2022

QualityNet Operations Dashboard v4.1

New Feature(s)

  • Full Decomposition for the following services:
    • Airflow
    • Hive
    • Ranger

Infrastructure Upgrade

  • Grafana upgraded to v8.4.4 from v8.1.8


Tabs Page
titleMarch 23, 2022

QualityNet Operations Dashboard v3.5

New Services

  • EQRS Portal
  • iQIES

 

New Features/Bug Fixes

  • Thresholds for FileCloud and Routing services are adjusted to display the status of the application accurately.
  • Fixed the Request API Count KPI so that it displays data; services that have a Request Count KPI now report data.


Tabs Page
titleMarch 9, 2022

QualityNet Operations Dashboard v3.4

Issues Resolved

  • DNS, Office365, AD, Certificate Authority - resolved issue where KPIs were not reporting in QNOD after a New Relic upgrade 

 

New Features

  • HQR - added Application and Network metrics, expanded Compute metrics, and added two more subsystems
  • HARP - moved from Collaboration panel to Identity & Access panel
  • Modified thresholds from 0 minutes to 3 minutes to reduce noise


Tabs Page
titleFebruary 23, 2022

QualityNet Operations Dashboard v3.3

New Services

  • CDR (Ambari, HIVE, and Ranger subsystems)
  • HQR
  • DELWeb

 

New Features

  • HARP Service drilldown updated to include the subsystems along with the HOMER subsystem.
  • Updated service drilldown dashboards to include Jira issues panel.


Tabs Page
titleFebruary 9, 2022

QualityNet Operations Dashboard v3.2

New Services

  • Airflow
  • SAS Viya
  • Zeppelin

 

New Features

  • FireEye vNX
    • Added Device Availability and Response Time KPIs
  • Metrics API
    • Added new report APIs to communicate deployment recovery metrics and new roles to API keys to enhance security


Tabs Page
titleJanuary 31, 2022

QualityNet Operations Dashboard v3.1.1

Category Changes in Current Service Status Overview Dashboard:

  • Zscaler has moved from Security to Network
  • Syslog has moved from Security to Monitoring


Issues Resolved

  • Provided a fix to the NewRelic service to reflect the current health status more accurately. 
  • Implemented update to the F5 service synthetic tests to reflect the current service health status in the dashboard.


Tabs Page
titleJanuary 26, 2022

QualityNet Operations Dashboard v3.1

New Features:

  • New Services
    • Network
      • AWS
      • QMARS Fax (Biscom)
      • Network Routing
      • Presentation Zone
      • WAN Connectivity
    • Collaboration
      • TestRail


Tabs Page
titleJanuary 6, 2022

QualityNet Operations Dashboard v2.7.1

New Features:

  • DevSecOps Metrics API MVP
    • Usable functionality will be a dashboard presenting the number of deployments per day, per application.
    • Please reach out to Tim Regulski for an API Key and instructions on configuration


Tabs Page
titleJanuary 5, 2022

QualityNet Operations Dashboard v2.7.0

New Features:

  • Entity Discovery Automation
    • We can catalog all devices on our data sources.
    • This lets us identify much more quickly which devices exist and which data gaps we may have for a particular service.
  • DAS QNOD Integration Prep
    • Worked with the HARP team to create a new DAS entitlement with HARP to provide the teams access to QNOD.
  • SaaS Issue Reporting
    • Slack service drill down can display RSS feeds for the latest active and resolved incidents. 


New Services:

  • SNOW
  • F5
  • FirePower IPS
  • Office 365/Exchange
  • QNET
  • Mailman
  • PRS
  • FireEye VNX



Tabs Page
titleDecember 15, 2021

QualityNet Operations Dashboard v2.6.0

New Features:

  • Implemented Grafana upgrade from 8.1.2 to 8.1.8 to address CVE-2021-43798.
  • QNOD alert notifications per service can be configured to send to multiple email distribution lists and Slack channels. Alerts will be sent on service state changes.

 

New Services:

  • McAfee Web Gateway
  • Trend Micro DS


Tabs Page
titleDecember 9, 2021

QualityNet Operations Dashboard v2.5.0

Architecture Improvements

  • Data collection processes have been decoupled from each other on a per-service basis. This will improve performance and make it less likely for a failure in one service's data pull to affect others.


New Features:

  • Enable QNOD notifications to send alerts to Email distribution or Slack channel
  • Service Status will now be derived from the weighted system health score. This will improve the accuracy of the system status and make it less subject to a single KPI status
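
The idea of deriving a status from a weighted score can be sketched as follows. The state-to-score mapping, the weights, and the status cut-offs are all hypothetical values chosen for illustration; the source does not state how QNOD weights KPIs or where it draws its thresholds.

```python
# Hypothetical score contributed by each KPI state
STATE_SCORES = {"ok": 1.0, "degraded": 0.5, "critical": 0.0}


def weighted_health(kpis):
    """Weighted health score (0-100) from (state, weight) pairs."""
    total_weight = sum(weight for _, weight in kpis)
    if total_weight == 0:
        return None  # no weighted KPIs, so no score
    earned = sum(STATE_SCORES[state] * weight for state, weight in kpis)
    return 100.0 * earned / total_weight


def service_status(score, degraded_below=90.0, critical_below=50.0):
    """Map a health score to a status using assumed cut-offs."""
    if score is None:
        return "no data"
    if score < critical_below:
        return "critical"
    if score < degraded_below:
        return "degraded"
    return "ok"


# One critical low-weight KPI among healthier, heavier ones
kpis = [("ok", 3), ("critical", 1)]
score = weighted_health(kpis)
print(score, service_status(score))  # 75.0 degraded
```

In this example the single critical KPI lowers the score but does not by itself mark the whole service as critical, which matches the stated goal of making the status less subject to one KPI.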


New Services:

  • CA Certificates
  • Survey Monkey
  • FireEye ETP


Issues Resolved:

  • KPIs that are designated to alarm only after a specified period of time will now alarm only after the specified period, as intended


Tabs Page
titleNovember 17, 2021

QualityNet Operations Dashboard v2.4.0

New Features: 

  • Added Service Health Panel to all dashboards to display the weighted score of the service over time
  • Added Current Issues Dashboard so that KPI issues are seen on a single dashboard


New Service: 

  • Active Directory


Tabs Page
titleNovember 8, 2021

QualityNet Operations Dashboard v2.3.1

Issues Resolved:

  • Updated the view for Current Service Status Overview Dashboard
  • Fixed the unit type in the FileCloud and MFT service drilldown panels
  • Nexus service divided into subsystems for more visibility into the service


Tabs Page
titleNovember 3, 2021

QualityNet Operations Dashboard v2.3

New Features

  • Migrated to serverless metric ingestion, increasing reliability and efficiency
  • Improved the layout of the current status dashboard to more succinctly show service groups
  • Fixed the display of certain metrics for Jira and Confluence
  • Implemented Notifications sent to Slack for service status change


New Services:

  • DNS
  • Slack


Tabs Page
titleOctober 20, 2021

QualityNet Operations Dashboard v2.2

New Features

  • Added weighted system health score to represent system health in a more dynamic way
  • Implemented automatic creation of dashboard drill-downs to ensure consistency and improve velocity
  • Produced POC/MVP of automatic discovery and metric ingestion engine
  • Redesigned the current status overview dashboard
  • Incorporated Unit Tests for Lambda functions



Tabs Page
titleOctober 1, 2021

QualityNet Operations Dashboard v2.0

Issues Resolved

  • KPIs for which missing data is acceptable will no longer affect the component or system status
  • Tuned the logging frequency for Flux tasks



Tabs Page
titleSeptember 1, 2021

QualityNet Operations Dashboard v1.5

New Services:

  • MFT
  • Tenable Nessus


New Features:

  • Enhanced the layout of the drilldown and service dependency tree diagram to improve the visibility of the KPIs.


Tabs Page
titleAugust 26, 2021

QualityNet Operations Dashboard v1.4.1

New Features:

  • Improved Performance -- serverless computing (AWS Lambda) was deployed to increase the efficiency and timing of queries, which lowers the load on the database tier by up to 90%
  • No Data -- the dashboard now displays 'No Data' if there is insufficient data to represent the service's status and KPI
  • UI Improvements -- various small improvements to the UI, color uniformity, panel type, etc


Bug Fixes:

  • "No Query Returned Results" error has been fixed on the Executive Dashboard


Tabs Page
titleAugust 20, 2021

QualityNet Operations Dashboard v1.4

New Service(s):

The following service(s) will be integrated with the new release:

  • Barracuda (Mailman)
  • HARP


New Features:

  • FileCloud, Nexus, and Splunk are fully decomposed with metrics provided in their drilldowns
  • Syslog synthetic test panels added to drilldowns
  • Confluence and Jira data ingest processing and visualization for Network and Database metrics added in drilldowns


Bug Fixes:

  • Added info to panels in drilldown dashboard


Tabs Page
titleAugust 11, 2021

QualityNet Operations Dashboard v1.3.1

Bug Fixes:

  • Infrastructure upgraded to address the intermittent "query not returning results" error on the Dashboard panels
  • Updated AMI to address security compliance findings


Tabs Page
titleAugust 4, 2021

QualityNet Operations Dashboard v1.3

New Service(s):

The following service(s) will be integrated with the new release:

  • GitHub
  • NewRelic


New Feature(s):

Service Drill Downs

  • Ansible, Jenkins, GitHub and New Relic drill-downs provide fully decomposed metrics.

 

Bug Fixes:

The following issues will be resolved with the new release:

  • QNet Dashboard Logo updated with a new transparent icon.
  • Logic to determine component status updated to reflect correct status of component.
  • Service Dependency Diagram panel updated for better visibility.
  • Updated the Executive Dashboard view to render the service status correctly.



Tabs Page
titleJuly 21, 2021

QualityNet Operations Dashboard v1.2

New Service(s):

  • ZScaler


What is new:

Service Drill Downs

  • Confluence, JIRA, and ZScaler drill-downs provide fully decomposed metrics.
  • Added scan latency panel to ClamAV User Experience component.

  • Link to the Confluence page with Jira issues fixed to ensure that it opens in a new tab instead of the same tab.


Tabs Page
titleJuly 14, 2021

QualityNet Operations Dashboard v1.1.1

Bug Fixes:

  • Increased the number of containers for Grafana to 2 to fix the "503 service temporarily unavailable" error.
  • Fixed the link from the dashboard to the Jira issues page on Confluence.
  • Increased the query timeout and query concurrency in InfluxDB to resolve the "query length limit exceeded" error.
  • Increased the CPU and memory allocation for Grafana, InfluxDB, and Telegraf.


Tabs Page
titleJuly 7, 2021

QualityNet Operations Dashboard v1.1

What is New:

The following systems have their own drilldown dashboard:

  • Confluence
  • JIRA
  • ServiceNow
  • FileCloud
  • Ansible
  • Nexus
  • Jenkins
  • Splunk
  • ClamAV
  • Syslog


What is included in this release:

  • Upgraded to Grafana v8.0.3
  • 508 Accessibility
    • Added 'alt' attributes to images
    • Removed heading tags (<h1> through <h4>)
    • Fixed some contrast issues
  • Removed 3 semi-hidden panels, reduced code by 180 lines


Tabs Page
titleJune 2, 2021

QualityNet Operations Dashboard v1.0

The following Applications are currently being monitored for availability (up/down):

  • Confluence
  • JIRA
  • ServiceNow
  • FileCloud
  • Ansible
  • Nexus
  • Jenkins
  • Splunk


What is included in this release:

  • Removed the ability for users to log into the dashboard with local accounts; users must now have a HARP account
  • Okta/HARP integration for authentication
  • 4-hour authentication timeout after inactivity
  • Automated vulnerability scanning utilizing Netsparker and Nessus
  • Implemented Sonar Scanner to validate code in GitHub for vulnerabilities and bugs
  • Fixed Overlay UI issues
  • Fixed panel lengths so that they all match
  • Updated the queries to fix the service status results in Grafana
  • Grafana synthetic testing to validate dashboard availability
  • Container- and host-based alerts to Slack, e.g., CPU utilization %, memory usage %, disk space, host not responding, and database storage utilization
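
The container- and host-based alerts listed above could be sketched roughly as follows. This is a minimal illustration, not the QNOD implementation: the threshold values, the metric names, and the HostSample, evaluate, and slack_payload names are all hypothetical, since the release notes do not state them. The only external fact relied on is that a Slack incoming webhook accepts a JSON body with a "text" field.

```python
from dataclasses import dataclass

# Hypothetical thresholds (percent); the real values are not given in the release notes.
THRESHOLDS = {
    "cpu_utilization_pct": 90.0,
    "memory_usage_pct": 90.0,
    "disk_used_pct": 85.0,
    "db_storage_used_pct": 80.0,
}


@dataclass
class HostSample:
    host: str
    responding: bool
    metrics: dict  # metric name -> percent value


def evaluate(sample, thresholds=THRESHOLDS):
    """Return a list of human-readable alert strings for one host sample."""
    if not sample.responding:
        # A host that does not answer cannot report metrics, so alert and stop.
        return [f"{sample.host}: host not responding"]
    alerts = []
    for name, limit in thresholds.items():
        value = sample.metrics.get(name)
        if value is not None and value >= limit:
            alerts.append(
                f"{sample.host}: {name} at {value:.1f}% (threshold {limit:.1f}%)"
            )
    return alerts


def slack_payload(alerts):
    """Build the JSON body for a Slack incoming webhook; None if nothing to send."""
    if not alerts:
        return None
    return {"text": "\n".join(alerts)}
```

A caller would serialize the returned payload with json.dumps and POST it to the webhook URL configured for the alert channel; the sketch leaves out the network call so that it can be run and checked in isolation.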


Known Issues:









Panel
titleNeed Help?

For any questions, please contact the Service Center by phone, email, or Slack.
Phone: 1-866-288-8914 (TRS: 711)
Email: ServiceCenterSOS@cms.hhs.gov
Slack: #help-service-center-sos

Dashboard Help Channel: #help-qnod-dashboard

Visit the Post-Consumer Onboarding page



Panel
titleWe Want Your Feedback

The QNOD product team actively seeks user feedback on our product. Your feedback helps us decide which features to prioritize and build, and helps us understand how our users think about our product. Sign up to learn when we have future opportunities.

How we collect feedback:

  • Messages sent to the #help-qnod-dashboard Slack channel
  • Surveys sent to the QNET community
  • Interviews with users
  • Live polls during QNET calls
  • Usability exercises
  • On-site feedback form (coming soon)