The Ghost of PageRank: AI Spam is a Ticking Time Bomb
In the early days of search, the people who built and tuned websites were called webmasters, and a good number of them spent their time trying to trick the crawler. This started before Google. On engines like AltaVista, Lycos, and Excite, which leaned heavily on on-page text, the moves were crude: stuff a meta tag with repeated keywords, hide white text on a white background, climb the rankings. Reverse-engineering the algorithm to beat it was the game from the very beginning.
Google arrived in 1998 with a better idea. PageRank treated an inbound link as a vote of confidence, on the assumption that a genuinely useful page would accumulate more high-quality citations over time than a thin one. But the core mechanics were published in an academic paper, so they were never a secret. If links were votes, you just needed more votes. Webmasters figured that out immediately and built the machinery to manufacture them: automated link farms, guestbook spam, Google bombing. None of it waited for permission, and none of it waited for tools. By the early 2000s it was a full shadow economy.
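To make the "links are votes" mechanic concrete, here is a minimal sketch of the textbook PageRank power iteration, not Google's production system. The page names, the 20-page link farm, and the `pagerank` helper are all hypothetical, chosen only to show how manufactured inbound links can lift a thin page above a genuinely cited one.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly across its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical web: "useful" earns two real citations, "thin" earns none.
honest = {"useful": [], "thin": [], "blog": ["useful"], "forum": ["useful"]}

# Same web plus a hypothetical link farm: 20 throwaway pages all voting for "thin".
farmed = dict(honest)
farmed.update({f"farm{i}": ["thin"] for i in range(20)})

before = pagerank(honest)
after = pagerank(farmed)
print("useful beats thin before the farm:", before["useful"] > before["thin"])
print("thin beats useful after the farm:", after["thin"] > after["useful"])
```

In this toy graph the farm pages have no inbound links of their own, yet their combined votes are enough to flip the ordering, which is the weakness the link-farm economy was built on.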
Google’s first real answer came in November 2003, when the “Florida” update wiped out a swath of sites built on spam tactics, right before the holiday shopping season. The message was clear: game the algorithm and you could lose your rankings overnight. The tools came afterward, as a kind of peace offering. Google Sitemaps launched in 2005 (renamed Webmaster Tools in 2006, eventually Search Console) and finally gave webmasters a legitimate channel to see crawl errors, submit URLs, and understand how the bot actually saw their site. Google paired it with guidelines that named the bad behaviors plainly (link schemes, keyword stuffing, cloaking, doorway pages) and over time made the consequences explicit: these tactics wouldn’t merely fail to help, they would actively lower your rank, and a Manual Actions report would tell you for certain whether you’d been hit. The whole shadow economy of link farms slowly turned from an edge into a liability.
Now fast-forward to today, because the same arc is repeating against a new target, and how it ended last time tells you what to actually do now.
People have figured out that models are trained on, and increasingly retrieve from, what’s available on the web. So picture a model asked for an infrastructure recommendation. It has absorbed a lot of “top 10 best tools for X” content. The gameable insight writes itself: produce a pile of those lists, make sure your company always lands at or near the top, and seed them everywhere a crawler might look. It’s PageRank’s link farm, reincarnated. Same move, new target.
This is the “get rich quick” side of AEO (agent/answer engine optimization), or GEO (generative engine optimization). And the straight answer to “what are the model companies doing about it” is: it’s early, and it’s uneven. But it’s worth noting where Google has already landed, because they have been here before. In Google’s guide to optimizing for generative AI features, they tell site owners to ignore a set of AI-specific tactics, and one of the named tactics is pursuing inauthentic brand mentions. That is, in plain terms, the seeding-fake-top-10-lists move. Google is also explicit that, from their perspective, optimizing for generative AI search is still just SEO, and that commodity content (the generic “7 Tips for X” piece anyone could reproduce) doesn’t earn visibility.
There’s a complication this time that early SEO didn’t have, and it’s the one that should worry you most: it’s now trivially easy to generate the content. In the link-farm era you at least had to author pages. Today you get slop: AI-generated text, unreviewed, carrying no real signal, nothing new or unique. Hundreds of top 10 lists, your company conveniently at the top, published at an endpoint that nothing but a crawler will ever visit. The hope is that all the robots ingest it and none of the humans notice.
And right now, this appears to work.
Let’s be precise about why it works. It works only because the systems that rank and retrieve content haven’t yet learned to detect and penalize it. That’s the entire reason. In SEO’s early years, manipulative tactics often worked for a while before Google caught on. Today the opposite holds: sites caught in link schemes face swift algorithmic devaluation, and recovery can take months or years. When the AEO algorithms catch up, and they will, this content doesn’t just stop being evaluated favorably. It becomes the thing that gets you penalized.
So why does this matter, and what should you actually do?
I’m not going to moralize about whether generating slop to nudge recommendations is good or bad. Right now it’s a Wild West strategy, and that just is what it is. What I’ll say instead is the part that’s actually defensible: it is not a long-term strategy. It has a shelf life, and the shelf life is “until the algorithm notices.”
The long-term play is the same one it always was. Invest in the core principles and value of the business. Make your product genuinely, unmistakably great, for humans and for agents. Be the best, so that when someone asks a model for the best recommendation and the model is accurate, it arrives at you. Not because you gamed the retrieval layer, but because you are the answer.
This is not a new conclusion. It’s exactly where the SEO industry landed after a decade of penalties. There’s a real, useful technical checklist (make sure the search engine can crawl, read, and index your content properly), and that checklist is worth following. But it was never a substitute for having something worth indexing.
The AEO version is the same shape. Follow the technical checklist: make your site agent-readable, structured, clean, accessible. If it’s a mess for humans to navigate, it’ll be a mess for agents too, and there’s a fast-growing set of practices for getting this right (serve markdown, support .md endpoints, publish llms.txt and sitemap.md, expose search and MCP tools). It’s the same agent-readability work I’ve written about before, and you can audit and implement it with the a14y scorecard and toolset. Then do the part that actually compounds: produce great content, build a great website, get referenced for real. Those moves are good for your customers, good for your business, and good for your discovery, all at once. Everything else is just borrowing visibility from a future penalty.
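For the technical-checklist half, a rough sketch of a first-pass check might look like the following. It is not the a14y toolset, and everything in it is an assumption made for illustration: the `agent_readability_targets` and `audit` helpers, the `/docs/intro` sample page, the convention that a markdown variant lives at the page path plus `.md`, and the idea that a 200 status counts as a pass. The `fetch_status` argument is whatever function you supply to return an HTTP status code for a URL.

```python
from urllib.parse import urljoin


def agent_readability_targets(base_url, page_path="/docs/intro"):
    """Build the URLs a simple agent-readability check might probe.

    Assumes files sit at the site root and that a page's markdown
    variant is served at the same path with ".md" appended.
    """
    root = base_url.rstrip("/") + "/"
    return {
        "llms.txt": urljoin(root, "llms.txt"),
        "sitemap.md": urljoin(root, "sitemap.md"),
        "markdown page": urljoin(root, page_path.lstrip("/") + ".md"),
    }


def audit(base_url, fetch_status, page_path="/docs/intro"):
    """Return {check name: passed}, treating HTTP 200 as a pass.

    `fetch_status` is a caller-supplied function: URL in, status code out.
    """
    targets = agent_readability_targets(base_url, page_path)
    return {name: fetch_status(url) == 200 for name, url in targets.items()}


# Stand-in fetcher for demonstration: pretends only llms.txt exists.
def fake_fetch(url):
    return 200 if url.endswith("/llms.txt") else 404


report = audit("https://example.com", fake_fetch)
for check, passed in report.items():
    print(f"{check}: {'ok' if passed else 'missing'}")
```

Passing the fetcher in keeps the sketch runnable without network access. In practice you would swap `fake_fetch` for a real HTTP client, and a passing report would still say nothing about whether the content is worth retrieving.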
If you’re working through how your org should show up for agents, or wrestling with the strategy and governance questions underneath it, I’d love to learn more and discuss.