Domain Specific Language for “Variable Transformation” with PySpark

by Jiao

In quantitative finance, a sophisticated model system can consist of a large collection of supervised ML models. Each model may require hundreds of features, many of which are derived from “raw” input data columns using PySpark. Keeping these transformations consistent across hundreds of models in an integrated system is a daunting task, and it becomes harder still as the transformations evolve during model research and development. A domain-specific language (DSL) for expressing such transformations was developed to provide not only a cleaner and more succinct grammar, but also a structured specification that can be automatically scanned for human errors such as cyclic definitions, conflicting definitions, and typos. The result is more efficient and productive model development.
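To make the idea of an automatically scannable specification concrete, here is a minimal sketch in plain Python. It is not the DSL described in this article: the spec format (a dict mapping each derived feature to the names it depends on), the function name `validate_spec`, and the error messages are all assumptions made for illustration. It checks for three of the error classes mentioned above: conflicts with raw columns, unknown inputs (typically typos), and cyclic definitions.

```python
from graphlib import TopologicalSorter, CycleError  # standard library, Python 3.9+


def validate_spec(spec, raw_columns):
    """Scan a hypothetical transformation spec for common human errors.

    spec: dict mapping each derived feature name to the list of names
          (raw columns or other derived features) it is computed from.
    raw_columns: set of raw input column names.
    Returns (evaluation_order, errors).
    """
    errors = []
    known = set(raw_columns) | set(spec)

    for name, deps in spec.items():
        # Conflict: a derived feature shadows a raw input column.
        if name in raw_columns:
            errors.append(f"conflict: '{name}' redefines a raw column")
        # Typo: a dependency that is neither raw nor defined in the spec.
        for dep in deps:
            if dep not in known:
                errors.append(f"unknown input '{dep}' used by '{name}'")

    # Build the dependency graph among derived features only;
    # raw columns have no dependencies and cannot form cycles.
    graph = {name: [d for d in deps if d in spec] for name, deps in spec.items()}
    try:
        # static_order() yields each feature after all of its dependencies.
        order = list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # exc.args[1] holds the nodes of the detected cycle.
        errors.append("cyclic definition: " + " -> ".join(exc.args[1]))
        order = []
    return order, errors


# Hypothetical example: two derived features built from one raw column.
order, errors = validate_spec(
    {"log_price": ["price"], "ret": ["log_price"]},
    raw_columns={"price"},
)
```

In a sketch like this, the returned evaluation order could also drive the sequence in which the corresponding PySpark column expressions are applied, so the same specification serves both validation and execution.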