☠️ How I Accidentally Made My Link Shortener into a Malware Honeypot

By Maya Kyler on February 29, 2024

I launched y.gy a few months ago as a next-generation link shortener. This came out of a personal need: my other project, getwaitlist.com, uses lots of referral links, and I wasn't convinced by any of the other link-shortening solutions out there. So I thought I would make my own, and perhaps it'd be useful to other folks like me.
I built a best-in-class link shortener with all the bells and whistles: from extensive customization to really good traffic analytics. It was everything I needed. As with many other link shorteners, I put the "Shorten Link" interface front-and-center on the home page, no signup required. I made usage free and unlimited, figuring that free is always the best marketing strategy. I quietly released it and began promoting it.

Scams from Day One

Malware came on day one. Some of the very earliest links that were submitted were obviously phishing links: fake login pages for Microsoft Online, made in free website builders. It dawned on me that the people most incentivized to find new free link shorteners are cybercriminals who keep getting banned from the other ones! They need link shorteners to obscure their traffic and make it harder to block their malicious pages.
I tried submitting some of those same links to other free link shorteners like Bitly and TinyURL to see how they were dealing with them. It became clear reasonably quickly that they had some internal blacklists that disqualified certain domains or types of websites, while other malicious sites were getting through. I reasoned that I would just have to keep an eye on it myself. People were usually submitting fewer than 100 links a day, so it was pretty easy for me to take 20 minutes and quickly check them out. I made a basic dashboard for monitoring the links that were getting submitted, and then could mark them as legitimate or banned.
Ban Dashboard

Trustworthiness at the Domain Level

Many of the links were always on the same domains, so I created some whitelisting and blacklisting logic. For example, a https://youtube.com/ link is always going to be legit. On the other hand, I could see that certain domains were always a scam. I remember a domain like my-online-store.shop where each subdomain like etsy.my-online-store.shop or shopify.my-online-store.shop would then be a separate phishing link. Cybercriminals have lots of domains, but not infinitely many (they do cost money!), so banning at the domain-level was actually quite effective.
Ironically, I soon noticed that an easy way for criminals to get around domain-bans was to use other link shorteners to disguise their link (much in the same way that they were trying to use y.gy). There was an easy fix there: I compiled an open-source repository of link shortening URLs and added them to the auto-ban blacklist.
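A minimal sketch of what this kind of domain-level check could look like, in Python. The list contents and the function names (registered_domain, classify) are assumptions for illustration, not the actual y.gy code, and the two-label domain extraction is deliberately naive.

```python
from urllib.parse import urlsplit

# Illustrative lists only; the real y.gy lists are not published in this post.
ALLOWED_DOMAINS = {"youtube.com"}
BANNED_DOMAINS = {"my-online-store.shop"}
# Other link shorteners are auto-banned so they cannot be used to hide a destination.
SHORTENER_DOMAINS = {"bit.ly", "tinyurl.com"}


def registered_domain(url: str) -> str:
    """Return a naive registered domain: the last two labels of the hostname.

    This is wrong for multi-part suffixes such as example.co.uk; a production
    version would consult the Public Suffix List instead.
    """
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def classify(url: str) -> str:
    """Return 'banned', 'allowed', or 'review' (still needs a manual check)."""
    domain = registered_domain(url)
    if domain in BANNED_DOMAINS or domain in SHORTENER_DOMAINS:
        return "banned"
    if domain in ALLOWED_DOMAINS:
        return "allowed"
    return "review"
```

Because the check is keyed on the registered domain, every subdomain of a banned domain is caught by a single entry, which is why one ban covers etsy.my-online-store.shop and shopify.my-online-store.shop alike.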
A big problem that came up at the domain level was what I'd call a trustworthy domain with untrustworthy subdomains, specifically where those subdomains represent user-generated content. I noticed that Hubside, Wix and Replit were frequently abused because of their generous free plans and seemingly lax security. Sometimes I'd wake up to 500 new links all from one subdomainXYZ.replit.com, which I'd ban promptly. (Just today, I banned a bunch of links from open-facebook.replit.app and google-official.replit.app. How Replit isn't catching this is beyond me.) By contrast, I never saw bad content hosted on Webflow or Squarespace, which seemed to have much higher hurdles to usage.
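One way to handle such hosts is to record bans against the user's subdomain rather than the whole domain. The sketch below assumes a hand-maintained set of user-content hosts; USER_CONTENT_HOSTS and ban_key are hypothetical names, not part of y.gy.

```python
from urllib.parse import urlsplit

# Hosts where each subdomain belongs to a different user (illustrative entries).
USER_CONTENT_HOSTS = {"replit.app", "replit.com"}


def ban_key(url: str) -> str:
    """Return the name a ban should be recorded against for this URL.

    For user-content hosts this is the user's subdomain, so that banning
    one abuser does not ban the whole platform. For everything else it is
    the naive two-label domain.
    """
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    for parent in USER_CONTENT_HOSTS:
        suffix = "." + parent
        if host.endswith(suffix):
            # Keep only the label directly under the platform domain.
            tenant = host[: -len(suffix)].split(".")[-1]
            return tenant + suffix
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


banned = {"open-facebook.replit.app", "google-official.replit.app"}
print(ban_key("https://open-facebook.replit.app/login") in banned)
```

With this keying, hundreds of links from one abusive subdomain are all blocked by a single entry, while other projects on the same platform stay usable.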

Trouble With More Sophisticated Criminals

With my simple domain-level whitelist/blacklist system, I had cut down on the overwhelming majority of spam. But some of it still managed to get through. One day, I received two emails from Amazon and Cloudflare (which I use for hosting) that malware had been found on my domain, and that it had been quarantined. The offending links pointed to phishing pages that only appeared as such to users in certain geographies, e.g. if you accessed a particular website from a USA IP, it would look like a normal blog, but when you accessed it from a French IP, then it would be a phishing page for the mobile phone operator Orange.
There were trickle-down consequences to this: the firms that reported the security issue to AWS/Cloudflare had also reported it automatically to various other anti-virus companies. Now, some legitimate users were reporting that they couldn't access y.gy links because they were blocked by a corporate firewall.
My next step was to up the stakes for my adversaries. Instead of quietly deleting malicious links, I now redirected them to a Scam Warning page. And I figured out a trick for detecting geofenced content. Between the two of those, abuse dropped promptly. The Scam Warning in particular seemed effective at discouraging use, since it warns the end-user, the recipient of the scam, that they're being scammed, which raises their alertness and makes them distrust whatever website linked them to it.
Anecdotally, most of these cybercriminals have a "signature" in that they only run one type of scam, the phishing pages look similar, the URLs even look similar, etc. Once I started forwarding their traffic to my Scam Warning page, I often saw those signatures disappear, probably because they had moved on to a link shortener that is less hostile to them and their business models.
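The redirect-instead-of-delete idea can be sketched as follows. This is an illustration under assumptions: the ShortLink record, the resolve function and the /scam-warning path are hypothetical, and the geofencing trick is not shown because the post does not describe it.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

SCAM_WARNING_PATH = "/scam-warning"  # hypothetical path for the warning page


@dataclass
class ShortLink:
    slug: str
    destination: str
    banned: bool = False


def resolve(slug: str, links: Dict[str, ShortLink]) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, redirect target) for a short-link slug.

    Banned links are kept in storage and redirected to the warning page,
    so the person who clicked learns that the link was a scam.
    """
    link = links.get(slug)
    if link is None:
        return 404, None
    if link.banned:
        return 302, SCAM_WARNING_PATH
    return 302, link.destination


links = {
    "ok": ShortLink("ok", "https://example.org/"),
    "bad": ShortLink("bad", "https://phish.example/", banned=True),
}
print(resolve("bad", links))
```

Keeping the banned record around, rather than deleting it, is what makes the warning possible: a deleted link can only return a 404, which tells the victim nothing.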

Toward Peace

Now that I had managed to further cut down on spam and disincentivize it, the links that were being submitted were overwhelmingly legitimate. I filed support tickets with various anti-virus companies, and got y.gy unblocked. All was well. But even with big, proprietary whitelists and blacklists, it was still tedious to have to manually check the other links that were getting submitted. I still had concerns:
  1. Even more sophisticated criminals would host legitimate content on a webpage, use that to pass the initial check, and then change it to scam content later. In order to really prevent abuse, you have to perpetually re-scan all links.
  2. We are just too small for a commercial link-scanning solution at scale. Something like VirusTotal costs many thousands of dollars a month.
I knew that I didn't really have great solutions to these. And I believed that I had to solve them, because otherwise I would run into the same malware problems again and my link shortener would keep getting flagged by anti-virus software. The fundamental problem was that my API was available for free. Sure, I could require a signup and a CAPTCHA and an email verification, but that doesn't really deter sophisticated bad actors. Those are easy to bypass. The big, abuse-killing hurdle is actually requiring up-front payment. The reason why it's so effective is that payment processors like Stripe have the most aggressive incentives for banning abusive users. Anything that can wind up in a chargeback is big trouble for Stripe, and that means that any paying users of y.gy have already passed Stripe's aggressive fraud protection system. That's a good horse to hitch my wagon to.
Plus, I figured I've actually built a pretty good service, so maybe folks should be paying for it? A price as low as $4 a month is low enough that it won't deter the desirable users: those who actually want to get value out of the service and are using it to meet some goal, and not just incidentally. But it will deter spammers. So, I spent a weekend paywall-gating my APIs. Now the amount of traffic I see is a fraction of what I saw previously -- it's actually small enough to scan perpetually. It comfortably fits in the free- and low-cost tiers of commercial link scanners.
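Perpetual re-scanning, as described above, could look roughly like this. It is a sketch under assumptions: LinkRecord, due_for_rescan and rescan are hypothetical names, and is_malicious stands in for whatever commercial scanner is used, whose API is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LinkRecord:
    url: str
    last_scanned: float  # Unix timestamp of the most recent scan
    banned: bool = False


def due_for_rescan(records: List[LinkRecord], now: float,
                   interval_seconds: float) -> List[LinkRecord]:
    """Return unbanned links whose last scan is at least interval_seconds old,
    oldest first."""
    due = [r for r in records
           if not r.banned and now - r.last_scanned >= interval_seconds]
    return sorted(due, key=lambda r: r.last_scanned)


def rescan(records: List[LinkRecord], now: float, interval_seconds: float,
           is_malicious: Callable[[str], bool]) -> List[str]:
    """Re-scan every due link, ban the ones the scanner flags, and return
    the URLs that were newly banned."""
    newly_banned = []
    for record in due_for_rescan(records, now, interval_seconds):
        if is_malicious(record.url):
            record.banned = True
            newly_banned.append(record.url)
        record.last_scanned = now
    return newly_banned
```

Run on a schedule, a loop like this catches pages that were clean at submission time and swapped to scam content later. The number of scanner calls per run is bounded by the number of live links, which is why a smaller, paid user base keeps it affordable.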

Does Paywalling Cure Everything?

No, we've still seen some users pay for our service and then upload malicious links. Whether they're using legitimately obtained (e.g. prepaid) credit cards or stolen credit cards, we don't know. But the way to think about all of this is like a funnel: first they have to be willing to pay, then they have to pass the payment processor's fraud prevention, then they have to enter a link that doesn't get automatically caught and banned, etc. At the end of the day, this funnel gets very thin, and it leaves very few links that we have to run through our malware scanners. This is now easy to manage.

Did You Leave any Free Features?

Yes, I left a free QR code creator. The reason for this is that the QR code, when generated, points directly at the destination URL, and doesn't get routed via the Y.GY servers, so it doesn't create an opportunity to disguise a link, and it's none of our concern. Previously we offered a QR code creator that made Y.GY links and had all the same customization and analytics of our short links -- but for most folks trying to make QR codes, that stuff doesn't really matter. They just want a QR code for their link. And we happily provide that.

What I Learned

There was a big and important lesson here for me: sometimes the dream of free is just too good for the practical world. I went through a lot of trouble and spent many hours before simply killing free usage, which I should've maybe done from the start.
This experience with a link shortener made me appreciate how quickly and diligently cybercriminals will look for new solutions. Because they keep getting banned everywhere else, they will be the first users to find your new solution. And they will try their best to exploit it -- and usually a new project is easy to exploit because the founder hasn't done all the anti-abuse work that mature solutions have. This means that the hardest time for a link shortener or an email provider is on day one.
Speaking of email providers, I had previously drafted an email service to compete with overpriced providers like Sendgrid or Mailgun. There's some cosmetic work that I have to do before taking it live. Email is even more prone to abuse than link shortening. Email costs money to send, and is the primary vector of phishing and scams. Now that I've had this experience with a link shortener, I know what to expect from launching an email service. I certainly wouldn't permit any free usage, and I know to prepare for scammers and fraudsters. I'm glad that I had this learning experience and am not jumping first-thing into offering a new email provider -- I would almost certainly get overwhelmed by the problems.

On the Non-Obviousness of Software Usage

I'm writing this blog post because I had evidently been naive about how my service would be used once I put it out into the world. Builders of software are often naive about this; people will use your software in ways you never even dreamed of.
By way of example of legitimate usage, when I created y.gy, I thought it would primarily be used by marketers or engineers for link tracking. But the reality was different: for example, I saw tons of links from SEO farms as they were creating backlink profiles. I was seeing tons of long links getting shortened -- not for marketing, but just because the links were long! I noticed quickly that I was getting a lot of international traffic: for example, some people were shortening links to lots of Arabic-language YouTube videos. There's a group of Peruvian newspapers that kept shortening links to all their articles. Some folks were shortening hundreds of links to online playlists, and I still don't know why. I didn't have any expectations when I started this project, so that was interesting to me!
