ProArch is looking for a talented Data Scientist to join our innovative team. The successful candidate will be responsible for designing and implementing predictive models and analyzing complex datasets. They should have a working knowledge of statistics, mathematics, and data science programming languages (e.g., SQL, R, Python) and have taken ML concepts to production on multiple projects.
Your primary responsibilities will be performing statistical analyses, running custom SQL queries, and identifying patterns and trends that can improve the efficiency and usability of our products and services. You will also maintain and improve our data infrastructure and tools, which calls for a good working knowledge of data engineering techniques, including ETL and ELT.
Key Responsibilities:
- Work with stakeholders to identify key metrics and opportunities for improving business processes, products, and services.
- Perform exploratory data analysis (EDA).
- Expertise in NLP and NLG.
- Good knowledge and understanding of LLMs and related techniques such as RAG, RLHF, and prompt tuning.
- Good understanding of MLOps, LLMOps, and/or FMOps.
- Build, deploy, and maintain data management systems and back-end data infrastructure for our business intelligence pipeline.
- Create data visualizations, reports, dashboards, and data audits.
- Design, train, and implement machine learning algorithms.
- Leverage techniques such as ensembling to create high-performing ML models.
Qualifications:
- Bachelor's degree in data science, data analytics, or related field
- Master's degree or Ph.D. in a quantitative field, such as statistics, computer science, mathematics, or engineering
- Proficiency in Python, R, C#, Java, or Kotlin
- Knowledge of data science toolkits such as R, NumPy, and MATLAB
- Experience with big data analytics technologies such as Spark/Databricks and Hadoop
- Experience with data visualization tools such as Power BI and Tableau
- Expertise in data mining and machine learning
- Working knowledge of statistical models and business intelligence
- Familiarity with cloud-based infrastructure
- Ability to store and process unstructured data with NoSQL databases and machine learning models