Many popular websites implement a standard known as oEmbed (open embed). It's essentially an API they provide that lets you pull information about a piece of content (a video, a photo, and so on) as structured data, so you can display it however you want without having to scrape the HTML.
For example, to get info about any YouTube video, you can hit its oEmbed endpoint here (replace YOUTUBE_URL with the URL-encoded address of the video):
http://www.youtube.com/oembed?url=YOUTUBE_URL&format=json
This returns JSON so you can display it however you like:
{
"title": "A Pep Talk from Kid President to You",
"type": "video",
"provider_url": "https://www.youtube.com/",
"thumbnail_height": 360,
"version": "1.0",
"author_url": "https://www.youtube.com/user/soulpancake",
"height": 270,
"html": "<iframe width=\"480\" height=\"270\" src=\"https://www.youtube.com/embed/l-gQLqv9f4o?feature=oembed\" frameborder=\"0\" allowfullscreen></iframe>",
"author_name": "SoulPancake",
"thumbnail_width": 480,
"thumbnail_url": "https://i.ytimg.com/vi/l-gQLqv9f4o/hqdefault.jpg",
"width": 480,
"provider_name": "YouTube"
}
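If it helps, here is a rough Python sketch of calling that endpoint and picking a few fields out of the response. The function names (build_oembed_url, fetch_oembed, summarize) are made up for illustration, the https form of the endpoint is an assumption on my part, and the fields used are just the ones shown in the sample response above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint from the example above; the https scheme is an assumption.
YOUTUBE_OEMBED_ENDPOINT = "https://www.youtube.com/oembed"


def build_oembed_url(video_url, endpoint=YOUTUBE_OEMBED_ENDPOINT):
    """Return the oEmbed request URL with the video URL percent-encoded."""
    return endpoint + "?" + urlencode({"url": video_url, "format": "json"})


def fetch_oembed(video_url, timeout=10):
    """Fetch and decode the oEmbed JSON for a video (needs network access)."""
    with urlopen(build_oembed_url(video_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(data):
    """Build a one-line summary; .get() is used in case a field is missing."""
    return "{} by {} ({})".format(
        data.get("title"), data.get("author_name"), data.get("provider_name")
    )
```

Calling fetch_oembed with a video's watch URL should give you a dict shaped like the JSON above, which you can pass to summarize or render however you like. The "html" field is the ready-made embed snippet if you just want to drop the player into a page.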
You can see a shed-load of providers here:
Also, there's this package I stumbled across that looks quite neat:
https://github.com/oscarotero/Embed
Good luck!