Get Me Out Of Data Hell

It is 9:59 AM in Melbourne, 9th October, 2024. Sunlight filters through my windows, illuminating swirling motes of dust across my living room. There is a cup of tea in my hand. I take a sip and savor it.

I text the other senior engineer on the team, who unlike me is full-time: "I'm ready to start at 10", as is our custom.

The minute hand moves.

It is 10:00 AM in Melbourne, 9th October, 2024. The sun is immediately extinguished and replaced by a shrieking skull hanging low in a frigid sky. I glance down at my tea, and it is as blood. I take a sip and savor it.

I text the other senior engineer on the team: "Are you ready to enter the Pain Zone?"1, as is our custom.

I. The Pain Zone

The Pain Zone, coated in grass which rends those who tread upon it like a legion of upraised spears, is an enterprise data warehouse platform. At the small scale we operate at, with little loss of detail, a data warehouse platform simply means that we copy a bunch of text files from different systems into a single place every morning.

The word enterprise means that we do this in a way that makes people say "Dear God, why would anyone ever design it that way?", "But that doesn't even help with security" and "Everyone involved should be fired for the sake of all that is holy and pure."

For example, the architecture diagram which describes how we copy text files to our storage location has one hundred and four separate operations on it. When I went to count this, I was expecting to write forty and that was meant to illustrate my point. Instead, I ended up counting them up three times because there was no way it could be over a hundred. This whole thing should have ten operations in it.

Retrieve file. Validate file. Save file. Log what you did. Those could all be one point on the diagram, but I'm being generous. And you can keep the extra six points for stuff I forgot. That's ten. Why are there a hundred and four? Sweet merciful Christ, why?

At two of the four businesses I've worked at, the most highly-performing engineers have resorted to something that I think of as Pain Zone navigation. It's the practice of never working unless pair programming simply to have someone next to you, bolstering your resolve, so that you can gaze upon the horrors of the Pain Zone without immediately losing your mind. Of course, no code alone can make people this afraid of work. Code is, ultimately, characters on a screen, and software engineers do nothing but hammer that code into shapes that spark Joy and Money. The fear and dread come from a culture where people feel bad that they can't work quickly enough in the terrible codebase, where they feel judged for slowing down to hammer the code into better shapes that sadly aren't on the Jira board, and where management looks down on people who practice craftsmanship.

The last doesn't even require malicious management; it just needs people that don't respect how deep craftsmanship can go. These are the same people that do not appreciate that an expert pianist is not simply pressing keys; they are obsessively perfecting timing and the force applied to each key, alongside dozens of factors that I can't comprehend. An unthoughtful person will see something and think "It can't be that hard" or more generously "That looks quite hard". The respectful thought to have when viewing any competent professional in a foreign domain, in every domain that I'm aware of, is "That must be way harder than it looks."

I have now seen enough workplaces, and thanks to the blog have access to enough executives, to know that this is what most cultures degenerate into. Terrible companies are perpetual cognitohazards where everyone is bullied all day. The median companies (which some people call "good" for lack of ever having seen better) lack the outright bullying but still consist of people that are trying to convince themselves that it's fine to feel disempowered or subservient all day. There are times in my life where I have to deal with this, but it is hardly fine. The best are places where you can get at least some of the things that a person needs other than rent money2.

This place has a better culture than most, as bullying is mostly not tolerated, we have fully remote work, and it is a terrible faux pas to accuse someone of not working fast enough outright, though you can hint at it gently.

It is worse than most places at software engineering, as... oh, you'll see.

In any case, I have a deal with the team. Every morning, grab your coffee, attend your meetings, and at 10 AM we navigate the Pain Zone together for at least three to four hours. Management is blissfully unaware that this force of camaraderie and mutual psychotherapy is the only way that things continue to limp along.

II. I Am Lost To The Pain Zone

We have one simple job today. The organization wants to know:

  1. Is data coming into our system?
  2. Is any data being lost?
  3. Each data source goes through approximately thirteen steps on average — how many are getting stuck along the way?

Someone has already landed all of the logs our system produces in the data warehouse, so we can examine them in there, alongside the actual data. Is that smart? I dunno, something feels a bit weird about it but I have no concrete objection. My co-navigator and I decide to look at the logs for one data source.

This is where, five seconds in, we begin to become lost in the Pain Zone.

Let's say the data source we picked was "Google Analytics". We search the landed logs for Google Analytics, expecting to see something like this.

Source
Google Analytics

Here is what I actually see in the source column, and yes, it actually looked exactly this bad.

Source
6g94-8jjf-eo84757h4758z", "jobStatus": "JobStatus.Waiting", "jobExpiry":"2023-10

That... is not "Google Analytics". In fact, what the fuck is that? It looks like someone has dumped a random snippet of JSON into the logs, but not even the entirety of the JSON. The strings aren't terminated. We should have around fifty source systems, so how many distinct source systems appear in— FIFTY-SEVEN THOUSAND?

We've been writing total nonsense to half the logs for over a year and no one noticed? We only have two jobs. Get the data and log that we got the data. But the logs are nonsense, so we aren't doing the second thing, and because the logs are nonsense I don't know if we've been doing the first thing.

I take a deep breath. The plan is to submit my notice on December 2nd anyway, so this is fine. This is so fine. The other engineers already know I'm leaving, and we've all committed to do the best we can for two months for our spiritual growth. The ability to do painful things is a virtuous skill to cultivate as a responsible adult.

Okay, how is this happening? Well, it turns out that we're embedding a huge amount of metadata in filenames, and the Lambda functions that produce all of this — of course, we're serverless, because how can you hurt yourself without a cutting-edge? — use lots of regex to extract data. Unfortunately, because we don't have any tests, someone eventually wrote some code to download data that passed a big JSON blob instead of a filename to the logging function, and that function happily went "Great, I'll just regex out the source system from the file name!" Except it wasn't a filename, so it has instead spewed garbage into the system for months.
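
For the curious, here is a minimal sketch of the kind of guard that would have caught this. It is not our actual Lambda code. The path layout is an assumption modelled on the one awslog value shown later in this post, and `FILENAME_PATTERN` and `extract_source` are hypothetical names. The point is only that anchoring the regex to the whole input makes a JSON blob fail loudly instead of yielding a garbage source.

```python
import re

# Assumed layout (hypothetical, inferred from a single example path):
# <stage>/<source>/<table>/year=YYYY/month=MM/day=DD/<file>.txt
FILENAME_PATTERN = re.compile(
    r"(?P<stage>[a-z_]+)/(?P<source>[a-z_]+)/(?P<table>[a-z_]+)/"
    r"year=(?P<year>\d{4})/month=(?P<month>\d{2})/day=(?P<day>\d{2})/"
    r"(?P<file>[^/]+\.txt)"
)


def extract_source(filename: str) -> str:
    """Return the source system, or raise if the input is not a filename."""
    # fullmatch requires the entire string to fit the layout, so a JSON
    # blob (or anything else that is not a path) is rejected outright.
    match = FILENAME_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"not a recognised filename: {filename[:40]!r}")
    return match.group("source")
```

Raising is a design choice here: a failed log write that someone notices the same day is cheaper than a year of plausible-looking nonsense.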

I find something like this every time we enter the Pain Zone. Sometimes we've laughed so hard that we've cried at the things we've seen $2,000 per day consultants do.

The issue is raised with the team, but because fixing this critical error in our auditability is not on the board and Velocity Must Be Up, fixing the logs is judged to be less important than... parsing... the nonsense logs. Why? We have another saying on our team, which is "Stop asking questions, you're only going to hurt yourself".

I take another deep breath.

Okay, we'll continue with the work instead of fixing the critical production error. We can't query the Google Analytics stuff based on source system, so let's pick another one. We also draw data from Twitter once every two hours, and that source column isn't broken for that. I just need to be able to associate the log events to begin working on that. That is, we'll have one log that says "I downloaded the data from Twitter" and another log that says "I checked that it had all the correct stuff in it", and I just have to tie them together.

I don't like the way this table is configured for various reasons, but I'm expecting to see:

Event ID  Source   Event       Success
1         Twitter  Downloaded  True
1         Twitter  Validated   True

Then I can just do something like:

select
  event,
  success
from
  log_table
where event_id = 1 and source = 'Twitter'

Now I can see if all the correct stuff happened.

But I cannot find an event_id column or anything that looks like one. I hit up the expert on this system, and am informed that I should use the awslog column. I look at it.

It looks like this:

awslog
converted/twitter/retweets_per_post/year=2023/month=03/day=11/retweets_per_post_fact-00045-8b3226g9.txt | Validated

I mean, firstly, what the fuck is this? Secondly, what the fuck is this? Thirdly, well, you get it. Why not just store this in a relational format? Why are they all in one column? Why do you hate me specifically?

Stop asking questions, you're only going to hurt yourself.

I am expected to use regular expressions to construct a key in my query. As far as I can tell, the numbers and letters don't represent or uniquely identify anything, they've really just been appended for no reason. I waste a fair amount of time figuring out if I can use them.
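
To make the complaint concrete, here is a rough sketch of pulling that one column apart into the fields a relational table would have given us for free. The field names (`stage`, `source`, `table`) are my guesses at what the path segments mean, based only on the single example value above, and `parse_awslog` is a hypothetical helper, not something that exists in the platform.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AwsLogEntry:
    stage: str      # assumed meaning of the first path segment
    source: str     # assumed: the source system
    table: str      # assumed: the dataset within that source
    day: date
    file_name: str
    event: str      # the text after the pipe, e.g. "Validated"


AWSLOG_PATTERN = re.compile(
    r"(?P<stage>[^/]+)/(?P<source>[^/]+)/(?P<table>[^/]+)/"
    r"year=(?P<year>\d{4})/month=(?P<month>\d{2})/day=(?P<day>\d{2})/"
    r"(?P<file_name>[^/|]+?)\s*\|\s*(?P<event>.+)"
)


def parse_awslog(value: str) -> AwsLogEntry:
    """Split one awslog string into typed fields, or raise if it does not fit."""
    match = AWSLOG_PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unparseable awslog value: {value[:40]!r}")
    return AwsLogEntry(
        stage=match["stage"],
        source=match["source"],
        table=match["table"],
        day=date(int(match["year"]), int(match["month"]), int(match["day"])),
        file_name=match["file_name"],
        event=match["event"].strip(),
    )
```

Note what is missing from the result: anything finer than the day, and anything that identifies a run. No amount of regex recovers information that was never written down.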

December 2nd, I tell myself. Of course, I could be working on a book, shipping a hobby project, and dedicating more time to the business we're committing to in January, but December 2nd was the plan. It will be a great exercise to come up with a plan to gradually refactor all of this while delivering the things we're supposed to. It will make me a better engineer. December 2nd, December 2nd, hold fast.

III. I'm Out

Okay, we can write a regular expression to identify all Twitter sources that came from 11/03/2023. This is very stupid, but compared to minimum wage in my home country, I am being compensated spectacularly to deal with this particular brand of stupidity.

But wait, we retrieve this once every two hours, which means that while I can find all the Twitter data pulls from the 11th, I can't actually tell which rows are associated with the 8 AM run versus the 2 PM run. This perplexing awslog column only identifies things down to the day, not the hour. We have another column that logs the exact time down to the second that a Lambda function has fired, but each step happens at a slightly different time, and each source takes different amounts of time based on filesize.

I message the team. "Any ideas for how to identify specific runs that don't assume there is only one run per day?"

We take a ten minute break. We return.

I am informed that there is no way to do this. All I can think of is to create a heuristic per data source, such that I see when the file was acquired then scan for the validation event that happens closest to the acquisition event without going so far ahead that I read the next successful validation event by mistake. I just wanted to see if data was landing in the platform. And to make things worse, I suddenly remember that I've seen this awslog thing before. A month after I joined the business, I saw it, and I said that it was unacceptably bad. The response was that it's okay because all the data we want is technically inside those strings, and this design is more flexible. Of course, since then we've added our first data source that is downloaded more than once a day, so it turns out, shockingly, that they should have Just Used Postgres and not tried to be excessively clever. As always.
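
The heuristic I had in mind looks roughly like the sketch below. It assumes we already have, for one data source, the timestamps of its acquisition events and its validation events, and it assumes a run's validation lands before the next acquisition starts, which nothing in the system guarantees. `pair_runs` is a hypothetical name, and this is an illustration of the workaround, not a recommendation.

```python
from __future__ import annotations

from bisect import bisect_left
from datetime import datetime


def pair_runs(
    acquired: list[datetime], validated: list[datetime]
) -> list[tuple[datetime, datetime | None]]:
    """Pair each acquisition with the first validation inside its window."""
    acquired = sorted(acquired)
    validated = sorted(validated)
    pairs = []
    for i, start in enumerate(acquired):
        # This run's window closes when the next acquisition begins.
        end = acquired[i + 1] if i + 1 < len(acquired) else None
        # First validation at or after the acquisition time.
        j = bisect_left(validated, start)
        candidate = validated[j] if j < len(validated) else None
        if candidate is not None and end is not None and candidate >= end:
            candidate = None  # that validation belongs to a later run
        pairs.append((start, candidate))
    return pairs
```

An acquisition paired with `None` is a run that appears to have stalled. A run identifier written by every step would make all of this unnecessary.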

How have we been running things like this for two years? Millions of dollars were spent on this system. Our CTO, who has never written code themselves, gets on stages every few months and just lies to people about things that the CTO can't possibly understand, pretending that any of this works and that they're a leader in the space. Then their friends buy the same software — I know because recruiters keep calling to ask me if I'll help lead the efforts. Almost every large business in Melbourne is rushing to purchase our tooling, tools like Snowflake and Databricks, because the industry is pretending that any of this is more important than hiring competent people and treating them well. I could build something superior to this with an ancient laptop, an internet connection, and spreadsheets. It would take me a month tops.

I've known for a long time that I can't change things here. But in this moment, I realize that the organization values things that I don't value, and it's as simple as that. I could pretend to be neutral and say that my values aren't better, but you know what, my values are better. Having tested code is better. Having comprehensible logs is better. I'm wasting their money sitting around until December, which is unethical. I'm disrespecting myself waiting two more months for a measly Christmas break payout, which is unwise. I've even degraded team morale because I've convinced some of the engineers that things should be better, but not management, so now some of the engineers are upset. I'm a net negative for this team, except for that one time I saved them so much money that it continues to cover all three of our managers' salaries combined.

As an afterthought, the person who just informed us that we have no way to associate logs to their respective ingestion events adds:

"By the way, I think that there's a chance some of the logs don't actually report the right things. Like the ones that say Validated: True are actually just hardcoded strings in the Lambda functions, and the people that wrote them may have meant to type in things like File Landed: True but made mistakes."

I am dumbstruck. The other senior is laughing hysterically.

It is 11:30 AM in Melbourne, 9th October, 2024. The wind is a vortex of ghost-knives sending birds careening from the sky. I glance down at my tea, and it is liquid hatred. I take a sip and savor it.

"Hey, are you still there?", my pairing partner replies.

"Yeah. Yeah. Listen, I'm done. I'm out today."

"What? What about December?"

"I could get the entire terrible first draft of a whole book out by December if I wasn't wasting time on this."

"... Fair."

I briefly consider contacting my partner, but I know she'll support me. I could check in with my parents, but they'd just worry for no reason. I could chat with my co-founders, but they're just going to tell me to do what I need to do. I could sleep on it, but that would just be to give myself the illusion of responsibility even as I barrel towards wasting two more months to earn money that, thanks to five years of diligently navigating various Pain Zones, I don't even need.

I resign at 2:00 PM.

IV. Blessed Freedom

It is 3:00 PM in Melbourne, 9th October, 2024. I have called my director, who is highly competent, and explained why every engineer wants to quit, and finalized the paperwork. My last day is the 5th of November, 2024. My only job title is now director of my own consultancy, and in January my savings will start to tick down. I glance down at my tea, and it is tea. I take a sip and savor it.

PS:

Firstly, I gave a talk at GDG Melbourne which you can watch here. The audio quality is not great, so I forgive anyone who taps out. The comments are weird because I asked people to flip a coin and respond with "This guy is the next Steve Jobs" or "This guy seems like a real piece of work", which I only regret a little bit. I should not be allowed to run a business.

Secondly, I gave a webinar to US board members at the invitation of the Financial Times. Suffice it to say that while people are sincerely trying their best, our leaders are not even remotely equipped to handle the volume of people just outright lying to them about IT. Also apparently my psychotic blog does not disqualify me from Financial Times affiliation, which is wild, but is maybe a useful lesson that the world is desperate for sincerity even when it isn't dressed up as corporate maturity.


  1. We do actually say this every morning. 

  2. The people that expect to get all of them are probably not doing themselves any favors either. 


Simplify Your Bioinformatics Workflow with Pixi: A Fresh Take on Conda

How to adopt pixi on an HPC cluster near you. Simplify your bioinformatics workflow today, with Pixi!

I Want Enemies


We Have a Mouse


All I want for Christmas is a negative leap second

I just want to see it. Just once. I want to watch that earthquake ripple through all of global electronic timekeeping. I want to see which organisations make it to January morning with nothing on fire. You know what a leap second is. The short version is that planet Earth is a terrible clock. I love leap seconds. I love the unsolvable problem which birthed leap seconds, I love the technical challenge of implementing leap seconds, I love that they are rare and delightful and that they solve a problem, and I love that this solution is hugely irritating to a huge number of people who have more investment in and knowledge of time measurement than I do. It is a huge hassle to deal with leap seconds and I love that there is no universal agreement on how to deal with them. What should Unix time, for example, do during a leap second? Unix time is a simple number. There's no way to express 23:59:60. Should it stall for a second? Should it overrun for a second and then instantaneously ...

I Will Fucking Piledrive You If You Mention AI Again

The recent innovations in the AI space, most notably those such as GPT-4, obviously have far-reaching implications for society, ranging from the utopian eliminating of drudgery, to the dystopian damage to the livelihood of artists in a capitalist society, to existential threats to humanity itself.

I myself have formal training as a data scientist, going so far as to dominate a competitive machine learning event at one of Australia's top universities and writing a Master's thesis where I wrote all my own libraries from scratch in MATLAB. I'm not God's gift to the field, but I am clearly better than most of my competition - that is, practitioners like myself who haven't put in the reps to build their own C libraries in a cave with scraps, but can read textbooks, implement known solutions in high-level languages, and use libraries written by elite institutions.

So it is with great regret that I announce that the next person to talk about rolling out AI is going to receive a complimentary chiropractic adjustment in the style of Dr. Bourne, i.e., I am going to fucking break your neck. I am truly, deeply, sorry.

I. But We Will Realize Untold Efficiencies With Machine L-

What the fuck did I just say?

I started working as a data scientist in 2019, and by 2021 I had realized that while the field was large, it was also largely fraudulent. Most of the leaders that I was working with clearly had not gotten as far as reading about it for thirty minutes despite insisting that things like, I dunno, the next five years of a ten thousand person non-tech organization should be entirely AI focused. The number of companies launching AI initiatives far outstripped the number of actual use cases. Most of the market was simply grifters and incompetents (sometimes both!) leveraging the hype to inflate their headcount so they could get promoted, or be seen as thought leaders1.

The money was phenomenal, but I nonetheless fled for the safer waters of data and software engineering. You see, while hype is nice, it's only nice in small bursts for practitioners. We have a few key things that a grifter does not have, such as job stability, genuine friendships, and souls. What we do not have is the ability to trivially switch fields the moment the gold rush is over, due to the sad fact that we actually need to study things and build experience. Grifters, on the other hand, wield the omnitool that they self-aggrandizingly call 'politics'2. That is to say, it turns out that the core competency of smiling and promising people things that you can't actually deliver is highly transferable.

I left the field, as did most of my smarter friends, and my salary continued to rise at a reasonable rate and sustainably as I learned the wisdom of our ancient forebears. You can hear it too, on freezing nights under the pale moon, when the fire burns low and the trees loom like hands of sinister ghosts all around you - when the wind cuts through the howling of what you hope is a wolf and hair stands on end, you can strain your ears and barely make out:

"Just Use Postgres, You Nerd. You Dweeb."

The data science jobs began to evaporate, and the hype cycle moved on from all those AI initiatives which failed to make any progress, and started to inch towards data engineering. This was a signal both that I had predicted correctly and that it would be time to move on soon. At least, I thought, all that AI stuff was finally done, and we might move on to actually getting something accomplished.

And then some absolute son of a bitch created ChatGPT, and now look at us. Look at us, resplendent in our pauper's robes, stitched from corpulent greed and breathless credulity, spending half of the planet's engineering efforts to add chatbot support to every application under the sun when half of the industry hasn't worked out how to test database backups regularly. This is why I have to visit untold violence upon the next moron to propose that AI is the future of the business - not because this is impossible in principle, but because they are now indistinguishable from a hundred million willful fucking idiots.

II. But We Need AI To Remain Comp-

Sweet merciful Jesus, stop talking. Unless you are one of a tiny handful of businesses who know exactly what they're going to use AI for, you do not need AI for anything - or rather, you do not need to do anything to reap the benefits. Artificial intelligence, as it exists and is useful now, is probably already baked into your business's software supply chain. Your managed security provider is probably using some algorithms baked up in a lab to detect anomalous traffic, and here's a secret, they didn't do much AI work either, they bought software from the tiny sector of the market that actually does need to employ data scientists. I know you want to be the next Steve Jobs, and this requires you to get on stages and talk about your innovative prowess, but none of this will allow you to pull off a turtleneck, and even if it did, you would need to replace your sweaters with fullplate to survive my onslaught.

Consider the fact that most companies are unable to successfully develop and deploy the simplest of CRUD applications on time and under budget. This is a solved problem - with smart people who can collaborate and provide reasonable requirements, a competent team will knock this out of the park every single time, admittedly with some amount of frustration. The clients I work with now are all like this - even if they are totally non-technical, we have a mutual respect for the other party's intelligence, and then we do this crazy thing where we solve problems together. I may not know anything about the nuance of building analytics systems for drug rehabilitation research, but through the power of talking to each other like adults, we somehow solve problems.

But most companies can't do this, because they are operationally and culturally crippled. The median stay for an engineer will be something between one and two years, so the organization suffers from institutional retrograde amnesia. Every so often, some dickhead says something like "Maybe we should revoke the engineering team's remote work privile - whoa, wait, why did all the best engineers leave?". Whenever there is a ransomware attack, it is revealed with clockwork precision that no one has tested the backups for six months and half the legacy systems cannot be resuscitated - something that I have personally seen twice in four fucking years. Do you know how insane that is?

Most organizations cannot ship the most basic applications imaginable with any consistency, and you're out here saying that the best way to remain competitive is to roll out experimental technology that is an order of magnitude more sophisticated than anything else your I.T department runs, which you have no experience hiring for, when the organization has never used a GPU for anything other than junior engineers playing video games with their camera off during standup, and even if you do that all right there is a chance that the problem is simply unsolvable due to the characteristics of your data and business? This isn't a recipe for disaster, it's a cookbook for someone looking to prepare a twelve course fucking catastrophe.

How about you remain competitive by fixing your shit? I've met a lead data scientist with access to hundreds of thousands of sensitive customer records who is allowed to keep their password in a text file on their desktop, and you're worried that customers are best served by using AI to improve security through some mechanism that you haven't even come up with yet? You sound like an asshole and I'm going to kick you in the jaw until, to the relief of everyone, a doctor will have to wire it shut, giving us ten seconds of blessed silence where we can solve actual problems.

III. We've Already Seen Extensive Gains From-

When I was younger, I read R.A Salvatore's classic fantasy novel, The Crystal Shard. There is a scene in it where the young protagonist, Wulfgar, challenges a barbarian chieftain to a duel for control of the clan so that he can lead his people into a war that will save the world. The fight culminates with Wulfgar throwing away his weapon, grabbing the chief's head with bare hands, and begging the chief to surrender so that he does not need to crush a skull like an egg and become a murderer.

Well this is me. Begging you. To stop lying. I don't want to crush your skull, I really don't.

But I will if you make me.

Yesterday, I was shown Scale's "2024 AI Readiness Report". It has this chart in it:

[Image: Scale Report.png]

How stupid do you have to be to believe that only 8% of companies have seen failed AI projects? We can't manage this consistently with CRUD apps and people think that this number isn't laughable? Some companies have seen benefits during the LLM craze, but not 92% of them. 34% of companies report that generative AI specifically has been assisting with strategic decision making? What the actual fuck are you talking about? GPT-4 can't even write coherent Elixir, presumably because the dataset was too small to get it to the level that it's at for Python3, and you're admitting that you outsource your decision making to the thing that sometimes tells people to brew lethal toxins for their families to consume? What does that even mean?

I don't believe you. No one with a brain believes you, and if your board believes what you just wrote on the survey then they should fire you. I finally understand why some of my friends feel that they have to be in leadership positions, and it is because someone needs to wrench the reins of power from your lizard-person-claws before you drive us all collectively off a cliff, presumably insisting on the way down that the current crisis is best remedied by additional SageMaker spend.

A friend of mine was invited by a FAANG organization to visit the U.S a few years ago. Many of the talks were technical demos of impressive artificial intelligence products. Being a software engineer, he got to spend a little bit of time backstage with the developers, whereupon they revealed that most of the demos were faked. The products didn't work. They just hadn't solved some minor issues, such as actually predicting the thing that they're supposed to predict. Didn't stop them spouting absolute gibberish to a breathless audience for an hour though! I blame not the engineers, who probably tried to actually get the damn thing to work, but the lying blowhards who insisted that they must make the presentation or presumably be terminated4.

Another friend of mine was reviewing software intended for emergency services, and the salespeople were not expecting someone handling purchasing in emergency services to be a hardcore programmer. It was this false sense of security that led them to accidentally reveal that the service was ultimately just some dude in India. Listen, I would just be some random dude in India if I swapped places with some of my cousins, so I'm going to choose to take that personally and point out that using the word AI as some roundabout way to sell the labor of people that look like me to foreign governments is fucked up, you're an unethical monster, and that if you continue to try { thisBullshit(); } you are going to catch (theseHands)

IV. But We Must Prepare For The Future Of-

I'm going to ask ChatGPT how to prepare a garotte and then I am going to strangle you with it, and you will simply have to pray that I roll the 10% chance that it freaks out and tells me that a garotte should consist entirely of paper mache and malice.

I see executive after executive discuss how they need to immediately roll out generative AI in order to prepare the organization for the future of work. Despite all the speeches sounding exactly the same, I know that they have rehearsed extensively, because they manage to move their hands, speak, and avoid drooling, all at the same time!

Let's talk seriously about this for a second.

I am not in the equally unserious camp that generative AI does not have the potential to drastically change the world. It clearly does. When I saw the early demos of GPT-2, while I was still at university, I was half-convinced that they were faked somehow. I remember being wrong about that, and that is why I'm no longer as confident that I know what's going on.

However, I do have the technical background to understand the core tenets of the technology, and it seems that we are heading in one of three directions.

The first is that we have some sort of intelligence explosion, where AI recursively improves itself, and we're all harvested for our constituent atoms because a market algorithm works out that humans can be converted into gloobnar, a novel epoxy which is in great demand amongst the aliens the next galaxy over for fixing their equivalent of coffee machines. It may surprise some readers that I am open to the possibility of this happening, but I have always found the arguments reasonably sound. However, defending the planet is a whole other thing, and I am not even convinced it is possible. In any case, you will be surprised to note that I am not tremendously concerned with the company's bottom line in this scenario, so we won't pay it any more attention.

A second outcome is that it turns out that the current approach does not scale in the way that we would hope, for myriad reasons. There isn't enough data on the planet, the architecture doesn't work the way we'd expect, the thing just stops getting smarter, context windows are a limiting factor forever, etc. In this universe, some industries will be heavily disrupted, such as customer support.

In the case that the technology continues to make incremental gains like this, your company does not need generative AI for the sake of it. You will know exactly why you need it if you do, indeed, need it. An example of something that has actually benefited me is that I keep track of my life administration via Todoist, and Todoist has a feature that allows you to convert filters on your tasks from natural language into their in-house filtering language. Tremendous! It saved me learning a system that I'll use once every five years. I was actually happy about this, and it's a real edge over other applications. But if you don't have a use case then having this sort of broad capability is not actually very useful. The only thing you should be doing is improving your operations and culture, and that will give you the ability to use AI if it ever becomes relevant. Everyone is talking about Retrieval Augmented Generation, but most companies don't actually have any internal documentation worth retrieving. Fix. Your. Shit.

The final outcome is that these fundamental issues are addressed, and we end up with something that actually can do things like replace programming as we know it today, or be broadly identifiable as general intelligence.

In the case that generative AI goes on some rocketship trajectory, building random chatbots will not prepare you for the future. Is that clear now? Having your team type in import openai does not mean that you are at the cutting-edge of artificial intelligence no matter how desperately you embarrass yourself on LinkedIn and at pathetic borderline-bribe award ceremonies from the malign Warp entities that sell you enterprise software5. Your business will be disrupted exactly as hard as it would have been if you had done nothing, and much worse than it would have been if you just got your fundamentals right. Teaching your staff that they can get ChatGPT to write emails to stakeholders is not going to allow the business to survive this. If we thread the needle between moderate impact and asteroid-wiping-out-the-dinosaurs impact, everything will be changed forever and your tepid preparations will have all the impact of an ant bracing itself very hard in the shadow of a towering tsunami.

If another stupid motherfucker asks me to try and implement LLM-based code review to "raise standards" instead of actually teaching people a shred of discipline, I am going to study enough judo to throw them into the goddamn sun.

I cannot emphasize this enough. You either need to be on the absolute cutting-edge and producing novel research, or you should be doing exactly what you were doing five years ago with minor concessions to incorporating LLMs. Anything in the middle ground does not make any sense unless you actually work in the rare field where your industry is being totally disrupted right now.

V. But Everyone Says They're Usi-

Can you imagine how much government policy is actually written by ChatGPT before a bored administrator goes home to touch grass? How many departments are just LLMs talking to each other in circles as people sick of the bullshit just paste their email exchanges into long-running threads? I guarantee you that a doctor within ten kilometers of me has misdiagnosed a patient because they slapped some symptoms into a chatbot.

What are we doing as a society?


An executive at an institution that provides students with important credentials, used to verify suitability for potentially lifesaving work and immigration law, asked me if I could detect students cheating. I was going to say "No, probably not"... but I had a suspicion, so I instead said "I might be able to, but I'd estimate that upwards of 50% of the students are currently cheating which would have some serious impacts on the bottom line as we'd have to suspend them. Should I still investigate?"

We haven't spoken about it since.


I asked a mentor, currently working in the public sector, about a particularly perplexing exchange that I had witnessed.

Me: Serious question: do people actually believe stories that are so transparently stupid, or is it mostly an elaborate bit (that is, there is at least a voice of moderate loudness expressing doubt internally) in a sad attempt to get money from AI grifters?

Them: I shall answer this as politically as I can... there are those that have drunk the kool-aid. There are those that have not. And then there are those that are trying to mix up as much kool-aid as possible. I shall let you decide who sits in which basket.

I've decided, and while I can't distinguish between the people that are slamming the kool-aid like it's a weapon and the people producing it in industrial quantities, I know that I am going to get a few of them before the authorities catch me - if I'm lucky, they'll waste a few months asking an LLM where to look for me.


When I was out on holiday in Fiji, at the last resort breakfast, a waitress brought me a form which asked me if I'd like to sign up for a membership. It was totally free and would come with free stuff. Everyone in the restaurant is signing immediately. I glance over the terms of service, and they reserve the right to use any data I give them to train AI models, and to share those models with an unspecified number of companies in their conglomerate.

I just want to eat my pancakes in peace, you sick fucks.

VI.

The crux of my raging hatred is not that I hate LLMs or the generative AI craze. I had my fun with Copilot before I decided that it was making me stupider - it's impressive, but not actually suitable for anything more than churning out boilerplate. Nothing wrong with that, but it did not end up being the crazy productivity booster that I thought it would be, because programming is designing and these tools aren't good enough (yet) to assist me with this seriously.

No, what I hate is the people who have latched onto it, like so many trailing leeches, bloated with blood and wriggling blindly. Before it was unpopular, they were the ones that loved discussing the potential of blockchain for the business. They were the ones who breathlessly discussed the potential of 'quantum' when I last attended a conference, despite clearly not having any idea what the fuck that even means. As I write this, I have just realized that I have an image that describes the link between these fields perfectly.

I was reading an article last week, and a little survey popped up at the bottom of it. It was for security executives, but on a whim I clicked through quickly to see what the questions were.

[Image: security_grift.png]

There you have it - what are you most interested in, dear leader? Artificial intelligence, the blockchain, or quantum computing?6 They know exactly what their target market is - people who have been given power over other people's money because they've learned how to smile at everything, and know that you can print money by hitching yourself to the next speculative bandwagon. No competent person in security that I know - that is, working day-to-day cybersecurity as opposed to an institution dedicated to bleeding-edge research - cares about any of this. They're busy trying to work out if the firewalls are configured correctly, or if the organization is committing passwords to their repositories. Yes, someone needs to figure out what the implications of quantum computing are for cryptography, but I guarantee you that it is not Synergy Greg, who does not have any skill that you can identify other than talking very fast and increasing headcount. Synergy Greg should not be consulted on any important matters, ranging from machine learning operations to tying shoelaces quickly. The last time I spoke to one of the many avatars of Synergy Greg, he insisted that I should invest most of my money into a cryptocurrency called Monero, because "most of these coins are going to zero but the one is going to one". This is the face of corporate AI. Behold its ghastly visage and balk, for it has eyes bloodshot as a demon and is pretending to enjoy cigars.

My consultancy has three pretty good data scientists - in fact, two of them could probably reasonably claim to be amongst the best in the country outside of groups doing experimental research, though they'd be too humble to say this. Despite this, we don't sell AI services of any sort. The market is so distorted that it's almost as bad as dabbling in the crypto space. It isn't as bad, meaning that I haven't yet reached the point where I assume that anyone who has ever typed in import tensorflow is a scumbag, but we're well on our way there.

This entire class of person is, to put it simply, abhorrent to right-thinking people. They're an embarrassment to people that are actually making advances in the field, a disgrace to people that know how to sensibly use technology to improve the world, and also a bunch of tedious know-nothing bastards that should be thrown into Thought Leader Jail until they've learned their lesson, a prison I'm fundraising for. Every morning, a figure in a dark hood7, whose voice rasps like the etching of a tombstone, spends sixty minutes giving a TEDx talk to the jailed managers about how the institution is revolutionizing corporal punishment, and then reveals that the innovation is, as it has been every day, kicking you in the stomach very hard. I am disgusted that my chosen profession brings me so close to these people, and that's why I study so hard - I am seized by the desperate desire to never have their putrid syllables befoul my ears ever again, and must flee to the company of the righteous, who contribute to OSS and think that talking about Agile all day is an exercise for aliens that read a book on human productivity.

I just got back from a trip to a substantially less developed country, and really living in a country, even for a little bit, where I could see how many lives that money could improve, all being poured down the Microsoft Fabric drain, it just grinds my gears like you wouldn't believe. I swear to God, I am going to study, write, network, and otherwise apply force to the problem until those resources are going to a place where they'll accomplish something for society instead of some grinning clown's wallet.

VII. Oh, So You're One Of Those AI Pessi-

With God as my witness, you grotesque simpleton, if you don't personally write machine learning systems and you open your mouth about AI one more time, I am going to mail you a brick and a piece of paper with a prompt injection telling you to bludgeon yourself in the face with it, then just sit back and wait for you to load it into ChatGPT because you probably can't read unassisted anymore.


PS

While many new readers are here, you may also enjoy "I Will Fucking Dropkick You If You Use That Spreadsheet", "I Will Fucking Haymaker You If You Mention Agile Again", or otherwise enjoy these highlighted posts. And I have a podcast where I talk with my friends about tech stuff honestly, titled "Does A Frog Have Scorpion Nature". Hope you enjoyed!

It has also been suggested that I am crazy for not telling people to reach out with interesting work at the end of every post. So here it is! I am available for reader mail and work at ludicity.hackernews@gmail.com.

Posts may be slower than usual for the upcoming weeks or months, as I am switching to a slower but more consistent writing schedule, more ambitious pieces, studying, working on what will hopefully be my first talk8, putting together a web application that users may have some fun with, and participating in my first real theater performance. Hope you enjoyed, and as always, thanks for reading.


  1. Which, to be fair, might explain why so many of the thoughts in the zeitgeist are always so stupid. Many of the executives I know in Malaysia were obsessed with Bitcoin, but have abruptly forgotten about this now that it is politically unpopular. 

  2. I know a few people who genuinely exhibit something I'd call political talent, but most of the time it boils down to promising people things regardless of your ability to deliver. This is not hard if you're shameless. If we're being honest, I had to do this once or twice to stay em 

  3. And we can argue about its Python quality too. 

  4. Which, thanks to U.S. healthcare, has the wonderful dual quality of meaning unemployed while also suggesting termination in the Arnold-Schwarzenegger-throws-you-into-molten-metal sense of the word.

  5. I was recently made aware that this is the quiet deal many SaaS providers have with executives. If you buy their software, such as Snowflake, it is quietly understood that you will be allowed to present your success on a stage, giving them piles of someone else's money and enhancing the executive's profile. 

  6. I don't actually know what 'zero-trust' architecture means, but I've heard stupid people say it enough that it's probably also a term that means something in theory but has been sullied beyond all use in day-to-day life. 

  7. It's me. I'm going to do this to you if you tell me that you need infrastructure prepared for another chatbot. You've been warned. 

  8. With an undisclosed group so they don't feel pressured to approve me, but it's looking good and will be available online! 
