Better Sample Data: Call for Developers
The Better Sample Data Working Group (BSDWG) of the Product Council is seeking the contribution of a large FOLIO dataset from an implementing library, to serve as the basis for a new FOLIO Bugfest dataset. The library’s ability to get university approval for sharing this dataset depends on a successful anonymization process.
The BSDWG requests that FOLIO’s governance Councils, as well as the Product Owners (POs), endorse and help distribute a call for development contributions from the community. The anonymization scripts written for the proof of concept (POC) were successful, but more development is required before they are ready for use. Developer volunteers would work with Lee Braginsky from EBSCO to finish the anonymization scripts and deploy them on a test dataset.
Preliminary efforts
Needs and requirements
Time commitment
180 working hours for coding and deployment
Generous estimate; actual work hours will depend somewhat on approach (Java vs. Python/Airflow)
Number of developers: 2
Or an equivalent number of contributed hours
Timeframe
2 sprints
During the Trillium release cycle or immediately after
Languages and platforms
Java (used for POC)
Python & Airflow (as an alternative, used in post-POC preliminary efforts)
SQL
FOLIO Schematool