Data Catalog Pilot

  • The Data & Analytics Team is seeking volunteers to help evaluate a candidate data catalog solution.
  • Evaluation sessions will be held between February 19, 2024, and March 15, 2024.
  • For more information or to express interest, contact us in Slack at #ccsq_data_analytics.

    CCSQ Data & Analytics Townhall

    Date

    Wednesday, January 24, 2024, at 1:00 pm ET

    Recording

    January Townhall Recording

    Presentation Slides

    January Townhall Presentation Slides

    Agenda

    1. Monthly Satisfaction Survey & Results
    2. Unified File Management
    3. Data Catalog Pilot
    4. Implicit vs. Explicit Connections
    5. Best Practices: Working with Small Datasets for Testing
    6. Q&A

    Monthly Satisfaction Survey Review & Poll

    See the recording (at 1:09) for the results of the previous month's poll.

    Unified File Management & Demo

    Unified File Management

    • QualityNet FileCloud is being decommissioned as part of the transition to the new Unified File Management service.
    • FileCloud users will automatically be migrated to Unified File Management during this transition period.
    • Important Dates:
      • January 22, 2024: Personal Files (the most recent versions only) will be migrated from FileCloud into the Unified File Management system, with an expected completion date of February 5, 2024.
      • February 8, 2024: Team Folders (the most recent versions only) will be migrated from FileCloud into the Unified File Management system, with an expected completion date of March 1, 2024.
      • By March 1, 2024: Network Shares will also be migrated to Unified File Management.

    Unified File Management – Top FAQs

    • Q: What is Unified File Management?
      • A: Unified File Management is a single system for the management of data files.
    • Q: How do I prepare for the transition to Unified File Management?
      • A: Review all files and data. Any content that is outdated or no longer needed should be removed.
    • Q: Will I still be able to access my files through FileCloud?
      • A: No. Once files are migrated to Unified File Management, they will no longer be accessible through FileCloud.
    • Q: Which files are being migrated to Unified File Management?
      • A: The most recent version of your personal files (My Files), Team Folders, and Network Shares will transition to the new system.
    • Q: What functionality is not supported in Unified File Management?
      • A: File versioning, file locking, marking files as favorites, and drive letters.

    Need more information?

    For a complete list of FAQs, visit the QualityNet IT Services FileCloud > Decommission FAQs page, or send a message in our Slack channel, #help-ufm.

    Data Catalog Pilot

    Implicit vs. Explicit Connections in SAS

    | Data Source | Data Target | Can I Use SQL Syntax? | Best Connection Type | SAS Usage |
    |---|---|---|---|---|
    | CDR | HIVE SCHEMA | YES | EXPLICIT | %HIVE_EXEC_SQL() |
    | CDR | WORK BENCH | N/A | EXPLICIT | %SELECT_TO_DATASET() |
    | WORK BENCH | HIVE SCHEMA | N/A | IMPLICIT | If the table size is small: SAS data step to %dbx_lib(). If the table size is too large: %_write_large_sas_file_to_dbx() |
    | CDR | HIVE SCHEMA | NO | 1. EXPLICIT, then 2. IMPLICIT | |

    Best Practice: Working with Small Datasets for Testing

    What are my priorities when writing and testing code?

    1. Can I access my data?
    2. Do I have the right information for any merges or transformations?
    3. Is my syntax correct?

    How much data do I need to use to meet my goals?

    • For priorities one and three: not much. Even one or two records would be sufficient.
    • For priority two: it varies by process, but 100 to 1,000 records would likely be sufficient.

    How does less data help me?

    • Faster processing
    • Faster syntax error identification

    SAS Syntax Example

    ```sas
    /* Copy only the first 10 observations of sashelp.cars into a WORK dataset for testing. */
    data work.cars_subset;
        set sashelp.cars (obs=10);
    run;
    ```

    SQL Syntax Example

    ```sql
    -- Create a 100-row subset of public_data.zip_codes for testing.
    CREATE TABLE public_data.zip_codes_subsets AS
    SELECT *
    FROM public_data.zip_codes
    LIMIT 100;
    ```
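
    For the second priority listed earlier (having the right information for merges), a small subset can be joined against a lookup table to check that the keys line up before running against the full data. The following is a minimal sketch only: public_data.counties and the county_fips and county_name columns are hypothetical names chosen for illustration and are not confirmed to exist.

    ```sql
    -- Sketch only: public_data.counties, county_fips, and county_name are hypothetical.
    WITH zip_sample AS (
        SELECT *
        FROM public_data.zip_codes
        LIMIT 1000          -- 100 to 1,000 records is likely enough to test a merge
    )
    SELECT
        z.*,
        c.county_name
    FROM zip_sample AS z
    LEFT JOIN public_data.counties AS c
        ON z.county_fips = c.county_fips;
    ```

    In a check like this, rows where the joined column comes back NULL would suggest keys that did not match, which is usually worth investigating before scaling up.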

    Did you know that you can use your Databricks notebook to test your SQL code?

    Since SAS explicit SQL processes (i.e., %hive_exec_sql(), %select_to_dataset()) pass your code to Databricks for processing, you can develop and test your queries directly in Databricks, then copy and paste the final query into the SAS macro.

    Databricks notebook logs are more informative, and you will compete with fewer users for resources because these compute clusters are allocated by organization.

    NOTE: Your Databricks notebook will not be able to resolve your SAS macro variables.
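
    Because the notebook cannot resolve SAS macro variables, one way to test such a query is to swap in a literal value temporarily. The sketch below assumes a hypothetical macro variable, &max_rows., used only for illustration; the table name comes from the SQL example on this page.

    ```sql
    -- Query as it might appear inside the SAS macro (hypothetical macro variable &max_rows.):
    --   SELECT * FROM public_data.zip_codes LIMIT &max_rows.
    --
    -- Version for testing in a notebook, with a literal in place of &max_rows.:
    SELECT *
    FROM public_data.zip_codes
    LIMIT 100;
    ```

    Once the query runs as expected, restore the macro variable reference before pasting the query back into the SAS macro.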

    Q&A