Articles

  • Aug 11, 2024 | seattledataguy.substack.com | Seattle DataGuy

    As a consultant, I have been called in to review and, in many cases, replace dozens of half-finished, abandoned, and sometimes forgotten data infrastructure projects. The data infrastructure in a few cases may just need a little tweaking to operate effectively, but other times the project is either so incomplete or so lacking in a central design that the best thing to do is replace the old system.

  • Jan 4, 2024 | towardsdatascience.com | Seattle DataGuy | Matt Collins

    Use various data source types to quickly generate text data for artificial datasets. In a previous article, we explored creating many-to-one relationships between columns in a synthetic PySpark DataFrame. This DataFrame only consisted of Foreign Key information and we didn’t produce any textual information that might be useful in a demo DataSet.

  • Apr 25, 2023 | towardsdatascience.com | Seattle DataGuy

    A hands-on comparison using ChatGPT and Domain-Specific Model. ChatGPT is a GPT (Generative Pre-trained Transformer) machine learning (ML) tool that has surprised the world. Its breathtaking capabilities impress casual users, professionals, researchers, and even its own creators. Moreover, its capacity to be an ML model trained for general tasks and perform very well in domain-specific situations is impressive. I am a researcher, and its ability to do sentiment analysis (SA) interests me.

  • Apr 24, 2023 | towardsdatascience.com | Seattle DataGuy | Barr Moses

    The post-modern data stack is coming. Are we ready? If you don’t like change, data engineering is not for you. Little in this space has escaped reinvention. The most prominent, recent examples are Snowflake and Databricks disrupting the concept of the database and ushering in the modern data stack era. As part of this movement, Fivetran and dbt fundamentally altered the data pipeline from ETL to ELT.

  • Apr 14, 2023 | seattledataguy.substack.com | Seattle DataGuy

    Regardless of whether you’re using Excel, Snowflake, or ChatGPT, the end goal of using data is to provide some sort of value. For businesses, this often means cutting costs or increasing revenue. Of course, the initiatives written out by a consultant or a CEO will never be titled something so bland. One example I have from my past was when I worked for a healthcare analytics company and we built a solution to help detect provider fraud.
