leopardd/UrlShortenerBundle

UrlShortenerBundle

Getting started

1. Install
$ composer require leopardd/url-shortener-bundle

2. Register bundle
<?php
// app/AppKernel.php

public function registerBundles()
{
    $bundles = [
        // ...
        new Leopardd\Bundle\UrlShortenerBundle\LeoparddUrlShortenerBundle(),
    ];
    
    // ...
}
  
3. (Optional) Set up parameters

# config.yml
leopardd_url_shortener:
    hashids:
        salt: new salt
        min_length: 10
        alphabet: abcdefghijklmnopqrstuvwxyz1234567890

4. Update database schema
$ php app/console doctrine:schema:update --force

Folder structure

Controller
DependencyInjection
Entity..................represents the table structure
Event
Exception
Factory.................creates Entity instances
Repository..............all interaction with the database
Resources
Service.................contains business logic

Feature & Update & Note

Algorithm

Get "short-url" process

  1. Insert the "long-url" into the database, then return the row-id
  2. Encode the row-id, then save the resulting code

Redirect "short-url" process

  1. Decode the incoming "short-url" to get the row-id
  2. Return the item stored at that row-id
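
The two processes above can be sketched in plain PHP. This is only an illustration, not the bundle's code: the in-memory array stands in for the database table, the function names (encodeId, decodeCode, shorten, resolve) are hypothetical, and plain base-62 is used as a stand-in for Hashids.

```php
<?php
declare(strict_types=1);

// Assumption: plain base-62 instead of Hashids (no salt, no minimum length).
const ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

function encodeId(int $id): string
{
    $base = strlen(ALPHABET);
    $code = '';
    do {
        // Prepend the digit for the current remainder.
        $code = substr(ALPHABET, $id % $base, 1) . $code;
        $id = intdiv($id, $base);
    } while ($id > 0);

    return $code;
}

function decodeCode(string $code): int
{
    $base = strlen(ALPHABET);
    $id = 0;
    foreach (str_split($code) as $char) {
        $pos = strpos(ALPHABET, $char);
        if ($pos === false) {
            throw new InvalidArgumentException("Invalid character: $char");
        }
        $id = $id * $base + $pos;
    }

    return $id;
}

// Get "short-url": insert the long url, take the row-id, encode it, save the code.
function shorten(array &$table, string $url): string
{
    $id = count($table) + 1;                 // stand-in for an auto-increment id
    $table[$id] = ['code' => null, 'url' => $url];
    $table[$id]['code'] = encodeId($id);

    return $table[$id]['code'];
}

// Redirect "short-url": decode the code back to the row-id, return that row's url.
function resolve(array $table, string $code): ?string
{
    $id = decodeCode($code);

    return $table[$id]['url'] ?? null;
}

$table = [];
$code = shorten($table, 'https://example.com/a/very/long/path');
echo $code, ' -> ', resolve($table, $code), PHP_EOL;
```

With plain base-62 the codes are sequential and easy to guess. Hashids mixes a salt into the encoding, which is what the salt, min_length and alphabet parameters configure.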

Reason behind "row-id" approach

A URL shortener produces a shortened version of a URL. It has two operations:

  1. Generate: produce a shortened version of the URL submitted
  2. Lookup: when the shortened version is requested, look up this reference in the database, then return the original URL

The challenges are:

  1. Lookup time
  2. Allow a very large number of unique IDs
  3. At the same time, keep the ID length as small as possible
  4. IDs should be reasonably user-friendly and, if possible, memorable
  5. Scale across multiple instances (sharding)
  6. What happens when the ID reaches the maximum value? For example, with a length of 7 and the characters [A-Z, a-z, 0-9], we can serve 62^7 (~3.5 trillion) URLs
  7. Replication: a database can crash for many reasons. How do we replicate instances, recover quickly, and keep reads and writes consistent?
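
The capacity arithmetic behind points 2, 3 and 6 can be checked with a few lines of PHP. The helper names (capacity, minLength) are hypothetical and exist only for this example.

```php
<?php
declare(strict_types=1);

// Number of distinct codes of exactly $length characters
// over an alphabet of $alphabetSize symbols.
// Assumes the result fits in a 64-bit integer.
function capacity(int $alphabetSize, int $length): int
{
    return $alphabetSize ** $length;
}

// Smallest code length that can represent at least $ids distinct ids.
function minLength(int $alphabetSize, int $ids): int
{
    $length = 1;
    $capacity = $alphabetSize;
    while ($capacity < $ids) {
        $capacity *= $alphabetSize;
        $length++;
    }

    return $length;
}

echo capacity(62, 7), PHP_EOL;             // 3521614606208, about 3.5 trillion
echo minLength(62, 1000000000), PHP_EOL;   // 6 characters cover one billion ids
```

A larger alphabet or a longer code raises the ceiling, so the trade-off in points 2 and 3 comes down to choosing these two numbers.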

In this bundle, we focus on point 1: how to reduce lookup time.

Table structure
id: number
code: shortened version of long url
url: long url

Attempt 1
- Generate: random id
- Lookup: simple lookup (O(n))

Attempt 2
- Generate: hash function from long url
- Lookup: simple lookup (O(n))

Attempt 3
- Generate: hash function from long url
- Lookup: bloom filters

Attempt 4
- Generate: hash function from record id
- Lookup: decoding (O(1))

So this project uses "Attempt 4", with Hashids as the hash function (Hashids generates a unique code for each distinct id).
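
To make the difference between the lookups concrete, here is a small sketch that contrasts a scan over the code column (Attempts 1 and 2) with decoding the code back to the row-id (Attempt 4). It is an illustration under assumptions: the rows live in a PHP array keyed by id, the function names are hypothetical, and PHP's built-in base_convert() stands in for Hashids.

```php
<?php
declare(strict_types=1);

// Rows keyed by id, with the code and url columns from the table structure.
// Assumption: base_convert() (base 36) is used here instead of Hashids.
$rows = [];
$urls = ['https://example.com/a', 'https://example.com/b', 'https://example.com/c'];
foreach ($urls as $i => $url) {
    $id = $i + 1;
    $rows[$id] = ['code' => base_convert((string) $id, 10, 36), 'url' => $url];
}

// Attempts 1 and 2: the code says nothing about where the row is,
// so rows are compared one by one until a match is found.
function lookupByScan(array $rows, string $code): ?string
{
    foreach ($rows as $row) {
        if ($row['code'] === $code) {
            return $row['url'];
        }
    }

    return null;
}

// Attempt 4: the code decodes straight back to the row-id,
// and the row is fetched by that id.
function lookupByDecode(array $rows, string $code): ?string
{
    $id = (int) base_convert($code, 36, 10);

    return $rows[$id]['url'] ?? null;
}

echo lookupByScan($rows, '2'), PHP_EOL;
echo lookupByDecode($rows, '2'), PHP_EOL;
```

Both functions return the same url. The second one does not depend on how many rows exist, which is the property Attempt 4 relies on.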

Reference && Tool

Algorithm