Original Post
I wanted to build a link parser like the one on Facebook but couldn’t find one that suited me, so I built my own. My code is based on the code found here, but I rewrote much of it to be cleaner and to return JSON rather than HTML.
Code Change – Feb 2nd, 2011
Added some refinements to the cleaning mechanism and greatly sped up the image parser.
HTML
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.4/jquery.min.js"></script>
<style>
#atc_bar{width:500px;}
#attach_content{border:1px solid #ccc;padding:10px;margin-top:10px;}
#atc_images {width:100px;height:120px;overflow:hidden;float:left;}
#atc_info {width:350px;float:left;height:100px;text-align:left; padding:10px;}
#atc_title {font-size:14px;display:block;}
#atc_url {font-size:10px;display:block;}
#atc_desc {font-size:12px;}
#atc_total_image_nav{float:left;padding-left:20px}
#atc_total_images_info{float:left;padding:4px 10px;font-size:12px;}
</style>
<br /><br /><br /><br />
<div align="center">
    <h1>Parse a Link Like Facebook with PHP and Jquery</h1>
    <div id="atc_bar" align="center">
        Paste Link Here:
        <input type="text" name="url" size="40" id="url" value="" />
        <input type="button" name="attach" value="Parse" id="attach" />
        <input type="hidden" name="cur_image" id="cur_image" />
        <div id="loader">
            <div align="center" id="atc_loading" style="display:none"><img src="load.gif" alt="Loading" /></div>
            <div id="attach_content" style="display:none">
                <div id="atc_images"></div>
                <div id="atc_info">
                    <label id="atc_title"></label>
                    <label id="atc_url"></label>
                    <br clear="all" />
                    <label id="atc_desc"></label>
                    <br clear="all" />
                </div>
                <div id="atc_total_image_nav" >
                    <a href="#" id="prev"><img src="prev.png" alt="Prev" border="0" /></a><a href="#" id="next"><img src="next.png" alt="Next" border="0" /></a>
                </div>
                <div id="atc_total_images_info" >
                    Showing <span id="cur_image_num">1</span> of <span id="atc_total_images">1</span> images
                </div>
                <br clear="all" />
            </div>
        </div>
        <br clear="all" />
    </div>
</div>
JavaScript
<script>
$(document).ready(function(){

    // Parse the link when the "Parse" button is clicked
    $('#attach').bind("click", parse_link);

    function parse_link () {
        if(!isValidURL($('#url').val())) {
            alert('Please enter a valid url.');
            return false;
        } else {
            $('#atc_loading').show();
            $('#atc_url').html($('#url').val());

            // encodeURIComponent (unlike escape) also encodes "+" and non-ASCII characters correctly
            $.post("fetch.php?url="+encodeURIComponent($('#url').val()), {}, function(response){

                //Set Content
                $('#atc_title').html(response.title);
                $('#atc_desc').html(response.description);
                // No #atc_price element exists in the markup above, so this line currently does nothing
                $('#atc_price').html(response.price);
                $('#atc_total_images').html(response.total_images);

                $('#atc_images').html(' ');
                $.each(response.images, function (a, b) {
                    // Images get the ids 1..n so that next/prev can address them by index
                    $('#atc_images').append('<img src="'+b.img+'" width="100" id="'+(a+1)+'">');
                });
                $('#atc_images img').hide();

                //Flip Viewable Content
                $('#attach_content').fadeIn('slow');
                $('#atc_loading').hide();

                //Show first image
                $('img#1').fadeIn();
                $('#cur_image').val(1);
                $('#cur_image_num').html(1);

                // next image (unbind first so repeated parses do not stack handlers)
                $('#next').unbind('click');
                $('#next').bind("click", function(){
                    var total_images = parseInt($('#atc_total_images').html(), 10);
                    if (total_images > 0) {
                        var index = parseInt($('#cur_image').val(), 10);
                        var new_index;
                        $('img#'+index).hide();
                        if(index < total_images) {
                            new_index = index + 1;
                        } else {
                            new_index = 1;
                        }
                        $('#cur_image').val(new_index);
                        $('#cur_image_num').html(new_index);
                        $('img#'+new_index).show();
                    }
                });

                // prev image
                $('#prev').unbind('click');
                $('#prev').bind("click", function(){
                    var total_images = parseInt($('#atc_total_images').html(), 10);
                    if (total_images > 0) {
                        var index = parseInt($('#cur_image').val(), 10);
                        var new_index;
                        $('img#'+index).hide();
                        if(index > 1) {
                            new_index = index - 1;
                        } else {
                            new_index = total_images;
                        }
                        $('#cur_image').val(new_index);
                        $('#cur_image_num').html(new_index);
                        $('img#'+new_index).show();
                    }
                });
            }, "json");
        }
    }
});

function isValidURL(url) {
    // Named urlPattern so it does not shadow the built-in RegExp object
    var urlPattern = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;
    return urlPattern.test(url);
}
</script>
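The next/prev handlers above wrap a 1-based image index around the ends of the list. Pulled out of the click handlers, that logic can be sketched as two small functions. The names `nextIndex` and `prevIndex` are hypothetical and do not appear in the original script.

```javascript
// Hypothetical helpers (not part of the original script): 1-based wraparound.
// Both expect numbers, so convert values read from the DOM with parseInt(value, 10) first.

// Moves forward by one image, wrapping from the last image back to the first.
function nextIndex(current, total) {
    return current < total ? current + 1 : 1;
}

// Moves back by one image, wrapping from the first image to the last.
function prevIndex(current, total) {
    return current > 1 ? current - 1 : total;
}
```

Keeping the arithmetic separate from the DOM code makes it easy to check the edge cases (first and last image) without loading a page.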
PHP
$url = urldecode($_REQUEST['url']);
$url = checkValues($url);
// Note: checkValues(), the cURL class and extract_tags() are defined in the full zip download

$return_array = array();

// Scheme and host, i.e. everything before the first "/" after "http(s)://"
$slash_pos = strpos($url, "/", 8);
$base_url = ($slash_pos === false) ? $url : substr($url, 0, $slash_pos);
// Directory of the page, used to resolve relative image paths
$relative_url = ($slash_pos === false) ? $url . '/' : substr($url, 0, strrpos($url, "/") + 1);

// Get Data
$cc = new cURL();
$string = $cc->get($url);
$string = str_replace(array("\n","\r","\t",'</span>','</div>'), '', $string);
$string = preg_replace('/(<(div|span)\s[^>]+\s?>)/', '', $string);
if (mb_detect_encoding($string, "UTF-8") != "UTF-8")
    $string = utf8_encode($string);

// Parse Title
$nodes = extract_tags( $string, 'title' );
$return_array['title'] = isset($nodes[0]['contents']) ? trim($nodes[0]['contents']) : '';

// Parse Base (a <base href="..."> tag overrides the URL used to resolve images)
$base_override = false;
$base_regex = '/<base[^>]*'.'href=[\"|\'](.*)[\"|\']/Ui';
preg_match_all($base_regex, $string, $base_match, PREG_PATTERN_ORDER);
if (!empty($base_match[1][0])) {
    $base_url = $base_match[1][0];
    $base_override = true;
}

// Parse Description
$return_array['description'] = '';
$nodes = extract_tags( $string, 'meta' );
foreach($nodes as $node) {
    if (isset($node['attributes']['name'], $node['attributes']['content'])
        && strtolower($node['attributes']['name']) == 'description')
        $return_array['description'] = trim($node['attributes']['content']);
}

// Parse Images
$images_array = extract_tags( $string, 'img' );
$images = array();
// "<" rather than "<=": the last valid index is sizeof() - 1
for ($i = 0; $i < sizeof($images_array); $i++) {
    $attributes = $images_array[$i]['attributes'];
    $img = isset($attributes['src']) ? trim($attributes['src']) : '';
    $width = isset($attributes['width']) ? preg_replace("/[^0-9.]/", '', $attributes['width']) : '';
    $height = isset($attributes['height']) ? preg_replace("/[^0-9.]/", '', $attributes['height']) : '';

    $ext = trim(pathinfo($img, PATHINFO_EXTENSION));

    if($img && $ext != 'gif') {
        // Leave absolute URLs (http or https) alone; resolve everything else
        if (!preg_match('#^https?://#i', $img)) {
            if (substr($img,0,1) == '/' || $base_override)
                $img = $base_url . $img;
            else
                $img = $relative_url . $img;
        }

        // No size attributes in the markup: fetch the image to measure it
        if ($width == '' && $height == '') {
            $details = @getimagesize($img);
            if(is_array($details)) {
                list($width, $height, $type, $attr) = $details;
            }
        }
        $width = intval($width);
        $height = intval($height);

        // Keep images that are large enough, not extremely wide or tall, and not logos
        if ($width > 199 || $height > 199 ) {
            if (
                (($width > 0 && $height > 0 && (($width / $height) < 3) && (($width / $height) > .2))
                || ($width > 0 && $height == 0 && $width < 700)
                || ($width == 0 && $height > 0 && $height < 700)
                )
                && strpos($img, 'logo') === false
            ) {
                $images[] = array("img" => $img, "width" => $width, "height" => $height, 'area' => ($width * $height), 'offset' => $images_array[$i]['offset']);
            }
        }
    }
}
$return_array['images'] = array_values(($images));
$return_array['total_images'] = count($return_array['images']);

header('Cache-Control: no-cache, must-revalidate');
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Content-type: application/json');

echo json_encode($return_array);
exit;
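To make the image rules in the PHP above easier to follow, here is a rough JavaScript restatement of the two steps: resolving an image path against the page URL, and deciding whether an image is worth showing. This is only a sketch. The function names `resolveImageUrl` and `isUsefulImage` are hypothetical, and the resolver uses the standard `URL` constructor rather than the string concatenation the PHP performs, so edge cases may differ.

```javascript
// Hypothetical helpers that mirror the PHP image handling; not part of the original download.

// Resolves an <img src> value against the page URL using the standard URL API.
// Returns null when the combination cannot be parsed.
function resolveImageUrl(src, pageUrl) {
    try {
        return new URL(src, pageUrl).href;
    } catch (e) {
        return null;
    }
}

// Applies roughly the same thresholds as the PHP: skip GIFs and anything with
// "logo" in the URL, require one side over 199px, and reject extreme shapes.
// A width or height of 0 means "unknown".
function isUsefulImage(url, width, height) {
    if (/\.gif$/i.test(url)) return false;
    if (url.indexOf('logo') !== -1) return false;
    if (width <= 199 && height <= 199) return false;

    if (width > 0 && height > 0) {
        var ratio = width / height;
        return ratio < 3 && ratio > 0.2;
    }
    if (width > 0 && height === 0) return width < 700;
    if (width === 0 && height > 0) return height < 700;
    return false;
}
```

The ratio limits are what filter out banners and thin spacer images, which tend to make poor thumbnails.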
thanx boss, i’ll use it to my site: catchfriend.com
thank you.
Very well done, useful stuff, thank you!
Great script. I found it useful. Is there a way to add the meta tags for FaceBook in this script? It would make it more like FB.
thanx for such a nice post. will use it on our next project
Hey, great script. I have one question: can this script work from localhost for testing, or do we have to upload it to a server? My localhost test is not working, and I do have an internet connection. Thanks for your reply and your advice. Great job!
This script will work for localhosts as well as remote machines. Of course, in order for it to work locally, you’ll need to install it on a server inside your intranet.
hey bacar, did you test on Windows machine ? if so, you should modify line 261 ” $cookie = ‘/tmp/cookies.txt’ ” to maybe ” $cookie = ‘./cookies.txt’ ” and VOILA . there you go 🙂
Hi,
I’m building a social network, but I’m really a newbie to PHP, JavaScript… Newbie is not the right word.. I know nothing about it really… I’m using Joomla and a social network solution called JomSocial. I really need link parsing though for when links are posted… I’d pay you to help me integrate your script into my site… I don’t know how.. I think it would be pretty easy for you. All of the PHP and JS files are nicely organized by JomSocial. Would you be interested?
Josh
And I’m open to help from anyone that sees this by the way..
Hello Josh,
Did you find a way to add it to JomSocial? I’m tired of asking the JomSocial team to add that feature!
best regards
What will make this rock is if it learns to read the Open Graph tags as well, given that is becoming an industry standard with Google also using it.
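For anyone wanting to try the Open Graph idea, here is a minimal sketch of pulling `og:` meta tags out of an HTML string. It is not part of the original script: `extractOpenGraph` is a hypothetical name, it is written in JavaScript rather than in the PHP backend, and the regex approach assumes reasonably tidy markup (a `>` inside an attribute value would break it).

```javascript
// Hypothetical helper: collects Open Graph <meta> tags from an HTML string.
// Returns an object such as { title: "...", image: "..." }.
function extractOpenGraph(html) {
    var result = {};
    var metaTags = html.match(/<meta\b[^>]*>/gi) || [];

    metaTags.forEach(function (tag) {
        // Attribute order varies between sites, so look each one up separately
        var property = /\bproperty\s*=\s*["']og:([^"']+)["']/i.exec(tag);
        var content = /\bcontent\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(tag);
        if (!property || !content) return;

        var value = content[1] !== undefined ? content[1] : content[2];
        // Keep the first occurrence of each property
        if (!(property[1] in result)) {
            result[property[1]] = value;
        }
    });

    return result;
}
```

A parser could prefer `og:title`, `og:description` and `og:image` when they are present and fall back to the `<title>`, description meta tag and image scan otherwise.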
Very nice! How about parsing it within a textarea without having to click the “Parse” button?
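One way to approach this is to watch the textarea and pull out the first URL the user types or pastes. The sketch below covers only the detection step. `findFirstUrl` is a hypothetical name, and the pattern handles only http and https links.

```javascript
// Hypothetical helper: returns the first http(s) URL found in free text, or null.
function findFirstUrl(text) {
    var match = /\bhttps?:\/\/[^\s<>"']+/i.exec(text);
    if (!match) return null;
    // Trailing punctuation usually belongs to the sentence, not the link
    return match[0].replace(/[.,;:!?)]+$/, '');
}
```

It could be called from a `keyup` or `paste` handler on the textarea, ideally after a short delay so a request is not fired on every keystroke, with the result sent to `fetch.php` the same way the “Parse” button does.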
Thank you boss. I am so grateful to you!
Brilliant. Just what I was looking for.
Can you write this PHP code in Java? This is exactly what I have been looking for over the last 2 days, but I am a Java developer and I don’t have any knowledge of PHP. Can you please write this in Java, if you know Java, or at least comment this code, line by line, with what each line is doing, so that I can at least try to write it myself? Please reply to my mailbox. Thanks.
very nice tutorial, this what i want. . . thx
Hi,
great tutorial. It’s working great, but I have a problem loading this URL http://teamtreehouse.com/. $cURL->get($url); returns string ” “. Could you please check that?
Thanks for help in advance.
Hello,
I am getting an error with the cUrl class.
Where do you have your cURL class defined? I have curl in PHP (I checked phpinfo and it is installed).
Best regards,
Joao Garin
The checkValues() function that you call on line 2 of the PHP script: where is it defined and what is its purpose?
Download the full zip file and it will be in there.
As to its purpose, I honestly have forgotten. I wrote this code over 3 years ago and, like most of us looking back at our code, I am ashamed at the quality and lack of comments. It was simply a hack I did over the weekend and I wanted to share it with the community. I haven’t had the time to update it, but if you plan on using it, I encourage you to tweak it, slice it, make it better, then share it back with the community. I’ll try to get out an update as soon as my schedule opens a bit. Cheers.
Hello all,
Great code, thanks for sharing. I took the code and created a plugin attached to a class in PHP that does exactly the same. I made some small improvements and organized a bit.
It is available for all under the MIT license in https://github.com/maparrar/linkparser. You are invited to participate to improve it!
@ Tony: It would be great if you give me your opinion.
Cheers.
Great stuff and I’ll give it a look shortly. Thanks for posting this for the community.
Thanks for the nice code.
On my local server it works fine but not working on the live site. Does it have an apache version requirement?
Thanks,
Masood
It shouldn’t
Very good article. I’m facing many of these issues as well..
I took maparrar’s code and updated it a bit. I added link submit with enter key and more features.
https://github.com/FlameCore/Infernum
[I accidentally posted the wrong link, now it’s correct]
I took maparrar’s code and updated it a bit. I added link submit with enter key and more features.
https://github.com/FlameCore/Flink