FZ Workshop at OSU

Thank you for attending the FZ workshop in Columbus!

Key Information and Links:

  • We use Eastern Time (ET) for this workshop.
  • Dates: Sept. 17-18; Day 1: 8:30 AM to 5:00 PM, Day 2: 8:30 AM to noon
  • Address: Ohio State University, Dept. of Computer Science and Engineering
  • Meeting room: Dreese Laboratory, Room 263
  • Meeting Zoom: One Zoom link for all the sessions
  • Meeting docs: All slides can be found in this shared folder
  • Parking: Tuttle Garage, $18/day
  • Host: Dr. Hanqi Guo

Schedule

Day Before (Sept. 16)

  • 6:00 PM Dinner

Day 1 (Sept. 17)

  • 8:30 AM Welcome by Anish Arora (Distinguished Professor of Engineering and Chair, Computer Science and Engineering; Faculty Director, 5G-OH Connectivity Center); intro/presentation of the schedule: 15 minutes
  • 8:45 AM State of the Project/Vision (Chair: Franck)
  • 9:00 AM Thrust I: Progress on Compression API and Generators (Chair: Robert): 75 minutes
    • Robert Underwood: Compiler Abstractions
    • Shihui Song: CERESZ-II
    • Yafan Huang: cuSZp and Compression for Light Sources
    • Discussion: Compiler Abstractions for Heterogeneous Systems
  • 10:15 AM Break: 15 minutes
  • 10:30 AM Thrust II: Compression Module Library (Chair: Kai): 75 minutes
    • Kai Zhao: LCP and Gromacs progress
    • Zizhe Jian: CLI-SZ
    • Kai Zhao: MDZ-v2
  • 11:45 AM Open discussion: 15 minutes
  • 12:00 PM Working Lunch: retrospective on how adoption is going so far; what is working and what is not: 60 minutes
  • 01:00 PM Thrust III.A: Visualization, quality assessment, and optimization (Chair: Hanqi): 75 minutes
    • Hanqi Guo: Progress on FZ-Viz
    • Yuxiao Li: MSZ preserving topological features with compression
    • Hrithik Devaiah Bollachettira Ajithkumar: Integration with Visualization Tools
    • Congrong Ren: Unstructured Mesh Compression
    • Yongfeng Qiu: Lossy Compression for Parallel Visualization
    • Yang Zhang (UIUC): Auto-encoders for Temporal Data
  • 02:15 PM Thrust III.B: Integrating Compression/Prediction into Applications (Chair: Sheng): 75 minutes
    • Jiajun Huang: Codesigning Compression with Communication
    • Sheng Di for Arham Khan: SECRE
    • Hasanur Md Rahman: FRXZ
  • 03:30 PM Break: 15 minutes
  • 03:45 PM GPU Compression Modules (Chair: Jiannan Tian): 75 minutes
    • Jiannan Tian: Engineering for GPU Compressors, Multi-Stream, Lossless on GPU
    • Boyuan Zhang: FZ-GPU, AI Compression on the GPU
  • 04:30 PM Open discussion: 30 minutes (website)
  • 05:00 PM Closing: 10 minutes

Day 2 (Sept. 18)

  • Parallel track 1 (Chairs: Sheng, Kai)
    • 08:30 AM Modularization of CLIZ and MGARD decorrelation? 90 minutes
    • 10:00 AM Overall integration? 60 minutes
  • Parallel track 2 (Chairs: Robert, Hanqi)
    • 08:30 AM Tutorial for new students (dry run of new material for SC24 tutorial?)
  • 11:00 AM Scaling Adoption: publicity, website, etc.; getting the project website ready for SC
  • 12:00 PM Closing/Lunch (food provided)

FZ Workshop at Sarasota

Thank you for considering attending the FZ workshop in Sarasota!

Dates, address, contact:

  • Dates: Feb. 14-15, 9:00 AM to 5:00 PM
  • Address: The John and Mable Ringling Museum of Art, 5401 Bay Shore Rd, Sarasota, FL 34243
  • Meeting room: Chao Lecture Hall.
  • Important: Do NOT use the main museum entrance. Please refer to this map for parking and entrance.
  • Meeting Zoom: One Zoom link for all the sessions
  • Meeting docs: All slides can be found in this shared folder
  • Please fill out this form whether you will attend in person or online.
  • We use Eastern Time (ET) for this workshop.
  • Host: Dr. Kai Zhao (kai.zhao@fsu.edu)

Schedule:

  • Day before (Feb. 13)

    • 5:00 PM Reception (cost on your own; Rico’s Pizzeria & Pasta House, 5131 N Tamiami Trail, Sarasota, FL 34234)
  • Day 1 (Feb 14)

    • 8:30 AM Doors open at the Chao Lecture Hall
    • 8:45 AM Please arrive at the Chao Lecture Hall by 8:45 AM.
    • 9:00 AM Workshop opening by Dr. Franck Cappello
    • 9:05 AM Welcome talk by Dr. Stacey Patterson (Vice President for Research, Florida State University)
    • 9:15 AM Welcome talk by Dr. Varun Chandola (NSF Program Director, CISE/OAC)
    • 9:20 AM General introductions: workshop participants present themselves and their expertise.
    • 9:40 AM Training on existing compressors (SZ, ZFP, LC, SPERR, LibPressio, etc.) (1h)
      • Overview (40 min)
      • Hands-on (20 min)
    • 10:40 AM Break (20 min)
    • 11:00 AM A general introduction to the FZ project and its three thrusts (programming interface and specific compressor generation; building of the compression module library; visualization, quality assessment, and optimization) (1h)
    • 12:00 PM Lunch (Cost on your own)
    • 1:30 PM A presentation of the FZ project progress so far and the next milestones (1h)
    • 2:30 PM A discussion about FZ module design with other compressors (30min)
    • 3:00 PM Break (20 min)
    • 3:45 PM Application session, part 1 (1h40min)
      • (4 application domains, 15 min each) A presentation by the application attendees of their application domains' requirements and constraints concerning lossy compression (1h)
    • 5:00 PM End of day
  • Day 2 (Feb 15)

    • 9:00 AM Application session, part 2 (1h40min)
      • (5 application domains, 12 min each) A presentation by the application attendees of their application domains' requirements and constraints concerning lossy compression (1h)
      • one-to-one break-out sessions with the application developers and users to collect (i) use-case requirements concerning compression ratio, speed, and accuracy criteria, and (ii) practical compression interface requirements, including APIs, I/O library integration, and shell commands (40 min)
        • Group 1-4 (20 min)
          • Group application 1: Climate, lead: Robert + compressor developers
          • Group application 2: Seismology, lead: Dingwen + compressor developers
          • Group application 3: Quantum circuit, lead: Sheng + compressor developers
          • Group application 4: Fusion, lead: Hanqi + compressor developers
        • Group 5-9 (20 min)
          • Group application 5: Cosmology, lead: Dingwen + compressor developers
          • Group application 6: Light sources, lead: Robert + compressor developers
          • Group application 7: Molecular Dynamics, lead: Kai + compressor developers
          • Group application 8: Combustion, lead: Hanqi + compressor developers
          • Group application 9: System logs, lead: Sheng + compressor developers
        • In parallel, preparation of the slides summarizing the discussion/test results for every application: Robert, Dingwen, Sheng, Kai, Hanqi (2 application domains each)
    • 10:40 AM Break (20 min)
    • 11:00 AM Hackathon sessions where multiple existing compression schemes will be tested for every application to identify relevant compression methods and gaps that could be addressed with lossy compressor customization (1h)
    • 12:00 PM Lunch (Cost on your own)
    • 1:00 PM Presentation of the discussion/test results for every application (1h30)
    • 2:30 PM Discussion/Reconciliation of the break-out session results
    • 5:00 PM End of the workshop

Kickoff Meeting

The FZ Kickoff Meeting will take place on Sept. 15, 2023, at IUPUI.

Thank you for considering attending the FZ Kickoff Meeting!

All slides for talks in the meeting can be found in this shared folder.

Here is the schedule:

  • 8:30 AM Welcome/intro/presentation of the schedule: 15 minutes
  • 8:45 AM Review of the project objectives and deliverables: 15 minutes [slides]
    • Description of the general modular design (modules for pipeline generation, modules for quality assessment, modules for optimization)
  • 9:00 AM Programming Interface and Compressor Generators: 75 minutes
    • Robert: 10 minutes to introduce the topic, discuss gaps and the development plan, and present progress [slides]
    • Dingwen: 10 minutes about some specifics of GPUs [slides]
  • 10:15 AM Break: 15 minutes
  • 10:30 AM Compression module library (modules for compression pipeline composition): 75 minutes
    • Kai: 10 minutes to introduce the topic, discuss gaps and the development plan, and present progress [slides]
    • Presentation from Martin (10 minutes) [slides]
    • Presentation from Jon C. (10 minutes) [slides]
    • Presentation from Xin (10 minutes) [slides]
  • 11:45 AM Open discussion: 15 minutes
  • 12:00 PM Working Lunch: how to maximize adoption?: 60 minutes
  • 01:00 PM Visualization, quality assessment, and optimization: 75 minutes
    • Hanqi: 10 minutes to introduce the topic, discuss gaps, and development plan, and present progress [slides]
    • Xiaodong: 10 minutes on Z-checker GPU quality assessment
    • Lin: 10 minutes on how to use topology to evaluate compression output
    • Robert and Sheng: 10 minutes on optimization [slides]
    • Dingwen's student (Daoce): 10 minutes on quality assessment (cosmology application, AMR data) [slides]
    • Hanqi's student (Congrong): 10 minutes on quality assessment (topology) [slides]
  • 02:15 PM Co-design of FZ with Application Partners: 45 minutes
  • 03:00 PM Adoption: Encouraging and Tracking, Measure of success: 30 minutes
  • 03:30 PM Break: 15 minutes
  • 03:45 PM Outreach/Education (Classes/Tutorials, BOFs): 30 minutes
  • 04:15 PM Next in-person meeting
  • 04:30 PM Open discussion: 30 minutes
  • 05:00 PM Closing: 10 minutes

ISC21 Tutorial

The ISC Tutorials are interactive courses focusing on key topics of high performance computing, networking, storage, and data science. Renowned experts in their respective fields give attendees a comprehensive introduction to the topic as well as a closer look at specific problems. Tutorials are encouraged to include a “hands-on” component that allows attendees to practice with the prepared materials.

The Tutorials will be held on Thursday, June 24, and on Friday, June 25, 2021.

The ISC 2021 Tutorials Committee is headed by Kevin Huck, University of Oregon, USA, with Kathryn Mohror, Lawrence Livermore National Laboratory, USA, as Deputy Chair.

International Workshop on Big Data Reduction (IWBDR)

Today’s applications produce volumes of data too large to be stored, processed, or transferred efficiently. Data reduction is becoming an indispensable technique in many domains because it can reduce data size by one or even two orders of magnitude, significantly saving memory/storage space, mitigating the I/O burden, reducing communication time, and improving energy/power efficiency in various parallel and distributed environments, such as high-performance computing (HPC), cloud computing, edge computing, and the Internet of Things (IoT). An exascale HPC system, for instance, is expected to provide a computational capability on the order of 10^18 floating-point operations per second, and large-scale HPC scientific applications may generate vast volumes of data (several orders of magnitude larger than the available storage space) for post-analysis. Moreover, runtime memory footprint and communication can be non-negligible bottlenecks of current HPC systems.

Tackling the big data reduction research requires expertise from computer science, mathematics, and application domains to study the problem holistically, and develop solutions and harden software tools that can be used by production applications. Specifically, the big-data computing community needs to understand a clear yet complex relationship between application design, data analysis and reduction methods, programming models, system software, hardware, and other elements of a next-generation large-scale computing infrastructure, especially given constraints on applicability, fidelity, performance portability, and energy efficiency. New data reduction techniques also need to be explored and developed continuously to suit emerging applications and diverse use cases.

There are at least three significant research questions that the community is striving to answer: (1) whether data reduction of several orders of magnitude is possible for extreme-scale science; (2) how to balance the trade-off between the performance and accuracy of data reduction; and (3) how to effectively reduce data size while preserving the information inside big datasets.

The goal of this workshop is to provide a focused venue for researchers in all aspects of data reduction in all related communities to present their research results, exchange ideas, identify new research directions, and foster new collaborations within the community.

More information can be found [here].

ECP Annual Meeting tutorial

Compression for scientific data

Lossy Compression for Scientific Data - Success Stories

Compression for scientific data

Compression for scientific data

Lossy Compression for scientific data

Large-scale numerical simulations and experiments are generating very large datasets that are difficult to analyze, store, and transfer. This problem will be exacerbated for future generations of systems. Data reduction becomes a necessity in order to reduce, as much as possible, the time lost in data transfer and storage. Lossless and lossy data compression are attractive and efficient techniques to significantly reduce dataset sizes while remaining rather agnostic to the application. This tutorial will review the state of the art in lossless and lossy compression of scientific datasets, discuss in detail two lossy compressors (SZ and ZFP), and introduce compression error assessment metrics. The tutorial will also cover the characterization of datasets with respect to compression and introduce Z-checker, a tool to assess compression error.

More specifically, the tutorial will introduce motivating examples as well as basic compression techniques, and will cover the role of Shannon entropy, the different types of advanced data transformation, prediction, and quantization techniques, as well as some of the more popular coding techniques. The tutorial will use examples of real-world compressors (GZIP, JPEG, FPZIP, SZ, ZFP, etc.) and datasets coming from simulations and instruments to illustrate the different compression techniques and their performance. This half-day tutorial builds on the evaluations of the two highly attended and highly rated tutorials given on this topic at ISC17 and SC17.
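
As a taste of the Shannon-entropy discussion mentioned above, the snippet below is a minimal, illustrative sketch (not taken from the tutorial materials): it quantizes a synthetic field under a fixed absolute error bound and estimates the entropy of the resulting quantization indices, which bounds how many bits per value an ideal lossless coder would need.

```python
import numpy as np

def shannon_entropy_bits(symbols):
    """Empirical Shannon entropy (bits/symbol) of an array of integer symbols."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Synthetic smooth field standing in for scientific data.
x = np.sin(np.linspace(0, 8 * np.pi, 1_000_000))

# Uniform quantization with bin width 2*eb keeps the point-wise error within eb.
eb = 1e-3
q = np.round(x / (2 * eb)).astype(np.int64)

print(f"entropy of quantization indices: {shannon_entropy_bits(q):.2f} bits/value "
      f"(original doubles use 64 bits/value)")
```

Real compressors such as SZ lower this entropy further by predicting each value from its neighbors and quantizing only the prediction residual.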

Compression for scientific data

Large-scale numerical simulations, observations, and experiments are generating very large datasets that are difficult to analyze, store, and transfer. Data compression is an attractive and efficient technique to significantly reduce the size of scientific datasets. This tutorial reviews the state of the art in lossy compression of scientific datasets, discusses in detail two lossy compressors (SZ and ZFP), and introduces compression error assessment metrics and the Z-checker tool for analyzing the difference between initial and decompressed datasets. The tutorial will offer hands-on exercises using SZ and ZFP as well as Z-checker. The tutorial addresses the following questions: Why lossless and lossy compression? How does compression work? How can compression error be measured and controlled? The tutorial uses examples of real-world compressors and scientific datasets to illustrate the different compression techniques and their performance. Participants will also have the opportunity to learn how to use SZ, ZFP, and Z-checker on their own datasets. The tutorial is given by two of the leading teams in this domain and targets primarily beginners interested in learning about lossy compression for scientific data. This half-day tutorial builds on the evaluations of the highly rated tutorials given on this topic at ISC17, SC17, and SC18.
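
The error assessment metrics referred to above can be stated in a few lines; the sketch below uses common textbook definitions (maximum point-wise error, RMSE, value-range-based PSNR, and compression ratio) and is not Z-checker's implementation.

```python
import numpy as np

def compression_metrics(original, decompressed, compressed_nbytes):
    """Common quality metrics for lossy compression of scientific data."""
    orig = np.asarray(original, dtype=np.float64)
    dec = np.asarray(decompressed, dtype=np.float64)
    diff = orig - dec
    value_range = orig.max() - orig.min()

    max_abs_err = np.abs(diff).max()              # maximum point-wise absolute error
    rmse = np.sqrt(np.mean(diff ** 2))            # root mean squared error
    psnr = 20 * np.log10(value_range / rmse)      # peak signal-to-noise ratio (dB)
    ratio = orig.nbytes / compressed_nbytes       # compression ratio

    return {"max_abs_err": max_abs_err, "rmse": rmse,
            "psnr_db": psnr, "ratio": ratio}
```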

Compression for scientific data

SC17 Tutorial

ISC17 Tutorial

Large-scale numerical simulations, observations, and experiments are generating very large datasets that are difficult to analyze, store, and transfer. This problem will be exacerbated for future generations of systems. Data compression is an attractive and efficient technique to significantly reduce the size of scientific datasets while remaining rather agnostic to the applications. This tutorial reviews the state of the art in lossless and lossy compression of scientific datasets, discusses in detail one lossless compressor (FPZIP) and two lossy compressors (SZ and ZFP), introduces compression error assessment metrics, and offers a hands-on session in which participants use SZ, FPZIP, and ZFP as well as Z-checker, a tool to comprehensively assess compression error. The tutorial addresses the following questions: Why compression, and in particular lossy compression? How does compression work? How can the compression error be measured and controlled? What is under the hood of some of the best compressors for scientific datasets? The tutorial uses examples of real-world compressors and scientific datasets to illustrate the different compression techniques and their performance. The tutorial is given by two of the leading teams in this domain and targets an audience of beginners as well as advanced researchers and practitioners in scientific computing and data analytics.
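
To get a feel for the hands-on session, the round trip below sketches how ZFP's fixed-accuracy mode can be driven from Python via the zfpy bindings; the function names (compress_numpy, decompress_numpy) and the tolerance keyword reflect my reading of zfpy and should be checked against the zfp documentation.

```python
import numpy as np
import zfpy  # Python bindings distributed with zfp (e.g. `pip install zfpy`)

# Synthetic smooth 3-D field standing in for simulation output.
data = np.fromfunction(
    lambda i, j, k: np.sin(i / 10) + np.cos(j / 10) + np.sin(k / 10),
    (64, 64, 64),
)

# Fixed-accuracy mode: the tolerance is intended to bound the point-wise error.
compressed = zfpy.compress_numpy(data, tolerance=1e-3)
restored = zfpy.decompress_numpy(compressed)

print("compression ratio:", data.nbytes / len(compressed))
print("max point-wise error:", np.abs(data - restored).max())
```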

Content Level: 60% beginner, 30% intermediate, 10% advanced

Targeted Audience: This tutorial is for researchers, students, and users of high performance computing interested in lossy compression techniques to reduce the size of their datasets: researchers and students involved in research using or developing new data reduction techniques; users of scientific simulations and instruments who require significant data reduction.

Prerequisites: Participants should bring their own laptop running Linux or Mac OS X. No previous knowledge of compression or of any programming language is needed.