Job Description
Who we are:
Shape a brighter financial future with us.
Together with our members, we’re changing the way people think about and interact with personal finance.
We’re a next-generation financial services company and national bank using innovative, mobile-first technology to help our millions of members reach their goals. The industry is going through an unprecedented transformation, and we’re at the forefront. We’re proud to come to work every day knowing that what we do has a direct impact on people’s lives, with our core values guiding us every step of the way. Join us to invest in yourself, your career, and the financial world.
The Role:
The Risk Data Science team is looking for a Data Scientist/Senior Data Scientist to develop advanced machine learning and statistical models and to guide measurement, strategy, and data-driven decision making in support of various credit risk and operational areas at SoFi. The Data Scientist will work closely with Credit, Risk, Product, Engineering, and Operations teams to design solutions for underwriting, portfolio management, loss mitigation, and loss forecasting. These tasks involve researching and applying state-of-the-art modeling methodologies to solve complex business problems. This role is very rewarding, as your work will have a direct and immediate impact on the business’s profitability.
What You’ll Do:
Develop, implement, and continuously improve machine learning and statistical models that support various credit, risk, and operational procedures, including but not limited to underwriting, portfolio management, loss mitigation, and loss forecasting.
Present model performance and insights to Credit, Risk, and Business Unit leaders.
Proactively identify opportunities to apply advanced modeling approaches to solve complex business problems.
Explore and leverage in-house and external data sources to enhance model predictive power.
Collaborate with the Model Risk Management team to demonstrate that models are developed with a high level of rigor that satisfies Model Risk Management and Governance requirements.
Perform ongoing monitoring of the models through the construction of dashboards and KPI tracking.
Collaborate with the Product and Engineering teams to improve the model development, deployment, monitoring, and re-calibration/re-build process.
Explore and apply in-house and open-source machine learning and statistical tools and algorithms to develop and improve models.
What You’ll Need:
Master’s degree in Statistics, Econometrics, Mathematics, Operations Research, Physics, Computer Science, Engineering, or another quantitative field required. PhD preferred.
2+ years of relevant work experience in building and implementing machine learning and statistical models.
Excellent logical reasoning and communication abilities when interpreting business requirements and translating them into effective data solutions.
Strong skills in writing efficient SQL queries and Python code to create complex attributes, especially with large datasets.
Strong sensitivity to details in data and the initiative to proactively investigate them to uncover unknown patterns.
Strong knowledge of databases and related languages/tools such as SQL, NoSQL, Hive, etc.
Demonstrated advanced experience in building efficient and reliable pipelines that interact with large datasets stored in SageMaker and Snowflake, automating recurring processes such as data extraction and processing, feature selection, model training, model monitoring, and generation of documentation templates to support reproducibility and cross-functional collaboration.
Excellent knowledge of machine learning and statistical modeling methods for supervised and unsupervised learning. These methods include (but are not limited to) regression, classification, clustering, outlier detection, novelty detection, decision trees, nearest neighbors, support vector machines, ensemble methods and boosting, neural networks, deep ... (truncated, view full listing at source)