What is web scraping?

index.html

When the user types or pastes a URL into the text-box field, a jQuery keyup event listener attached to the field sends an Ajax POST request to the "fetch_url.php" page. The listener fires on every key release, including the Enter key, but a request is sent only once the value contains "://". The "fetch_url.php" page extracts the meta tag details of the requested URL and returns them as HTML, which is then displayed without any page refresh.

The code below extracts URL data in the style of a Facebook link preview, using jQuery and Ajax.

<!DOCTYPE html>
<html>
<head>
<style>
.container, #url {
    width: 500px;
    border: 1px solid #d6d7da;
    padding: 0px 5px 5px 5px;
    border-radius: 5px;
    font-family: arial;
    color: #333333;
    font-size: 14px;
    background: #ffffff;
    box-shadow: rgba(200,200,200,0.7) 0 4px 10px -1px;
    margin: 0px auto;
    float: left;
    clear: both;
    margin-top: 10px;
}
</style>
<script type="text/javascript" src="jquery-3.2.1.min.js"></script>
<script type="text/javascript">
$(document).ready(function () {
    // Runs on every key release in the URL field.
    $("#url").keyup(function () {
        var val = document.getElementById("url").value;
        // Send a request only once the value looks like a URL.
        if (val != "" && val.indexOf("://") > -1) {
            $('#loading').text('Loading...');
            $('.container').hide();
            $.ajax({
                type: 'post',
                url: 'fetch_url.php',
                data: { link: val },
                cache: false,
                success: function (response) {
                    // Show the HTML preview returned by fetch_url.php.
                    $('#loading').text('');
                    $('.container').show();
                    $('.container').html(response);
                }
            });
        }
    });
});
</script>
</head>
<body>
<h1>Skptricks Extract URL Data Like Facebook Using PHP, jQuery And Ajax</h1>
<div>
    <textarea id="url" placeholder="Enter Complete URL"></textarea>
    <div id="loading" style="clear:both;"></div>
    <div class="container" style="display:none;"></div>
</div>
</body>
</html>
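The keyup handler above sends a request as soon as the value contains "://", so a value such as "ftp://x" or a half-typed address would also be posted. A stricter check could be sketched as follows. This is only an illustration, not part of the original code: the helper name isFetchableUrl is hypothetical, and it assumes the standard URL constructor is available (current browsers and Node.js).

```javascript
// Hypothetical helper (not in the original post): returns true only for
// absolute http(s) URLs, using the standard URL constructor to parse them.
function isFetchableUrl(value) {
  const text = String(value).trim();
  if (text === "") {
    return false;
  }
  try {
    const parsed = new URL(text); // throws a TypeError on unparsable input
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch (err) {
    return false;
  }
}
```

Inside the keyup handler, the condition `val != "" && val.indexOf("://") > -1` could then be replaced by `isFetchableUrl(val)`. Because the handler still runs on every key release, this reduces the number of requests but does not limit them to one per URL.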

fetch_url.php

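<?php
// Run only when the Ajax request supplied an http(s) link.
$main_url = isset($_POST["link"]) ? trim($_POST["link"]) : "";
$scheme = strtolower((string) parse_url($main_url, PHP_URL_SCHEME));
if ($scheme !== "http" && $scheme !== "https") {
    exit;
}

$title = "";
$description = "";
$img = array();

// Download the page; @ suppresses the warning if the request fails.
$str = @file_get_contents($main_url);

if ($str !== false && strlen($str) > 0) {
    // Extract the title. Collapsing whitespace supports line breaks inside <title>.
    $flat = trim(preg_replace('/\s+/', ' ', $str));
    if (preg_match('/<title[^>]*>(.*?)<\/title>/i', $flat, $match)) {
        $title = trim($match[1]);
    }

    // Extract the description from the page's meta tags.
    $tags = @get_meta_tags($main_url);
    if (!empty($tags['description'])) {
        $description = $tags['description'];
    }

    // Extract og:image, the image Facebook reads from a page and treats as its default image.
    $d = new DOMDocument();
    @$d->loadHTML($str);
    $xp = new DOMXPath($d);
    foreach ($xp->query("//meta[@property='og:image']") as $el) {
        $content = $el->getAttribute("content");
        // Keep absolute URLs only.
        if (parse_url($content, PHP_URL_SCHEME)) {
            $img[] = $content;
        }
    }
}
?>
<a href="<?php echo htmlspecialchars($main_url, ENT_QUOTES); ?>" style="text-decoration: none;" target="_blank">
<?php
// Escape every value taken from the remote page before echoing it.
if (!empty($img)) {
    echo "<img style='max-height:100%; max-width:100%;' src='" . htmlspecialchars($img[0], ENT_QUOTES) . "'><br>";
}
// The final false argument leaves entities already present in the text untouched.
echo "<br><h2 id='title'>" . htmlspecialchars($title, ENT_QUOTES, 'UTF-8', false) . "</h2>";
echo "<p id='desc'>" . htmlspecialchars($description, ENT_QUOTES, 'UTF-8', false) . "</p>";
?>
</a>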
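To try the three lookups in fetch_url.php (title, description, og:image) without a PHP server, the same idea can be sketched in JavaScript on an HTML string. This is a rough illustration under stated assumptions, not a port of the PHP: the function name extractPreview and the sample markup are made up for the example, and it uses regular expressions instead of the DOM/XPath approach, so it assumes the name or property attribute appears before content inside each meta tag.

```javascript
// Hypothetical helper (not in the original post): reads the same three
// fields that fetch_url.php extracts, from an HTML string.
function extractPreview(html) {
  // Collapse whitespace so a <title> spanning several lines still matches.
  const flat = html.replace(/\s+/g, " ").trim();

  const titleMatch = flat.match(/<title[^>]*>(.*?)<\/title>/i);

  // Assumption: name/property comes before content within the tag.
  // Group 1 is the opening quote, group 2 is the attribute value.
  const descMatch = flat.match(
    /<meta[^>]*\bname=["']description["'][^>]*\bcontent=(["'])(.*?)\1/i
  );
  const imageMatch = flat.match(
    /<meta[^>]*\bproperty=["']og:image["'][^>]*\bcontent=(["'])(.*?)\1/i
  );

  return {
    title: titleMatch ? titleMatch[1].trim() : "",
    description: descMatch ? descMatch[2] : "",
    image: imageMatch ? imageMatch[2] : "",
  };
}

// Made-up sample page used only for this illustration.
const sample = `
<html><head>
  <title>
    Example Page
  </title>
  <meta name="description" content="A short summary.">
  <meta property="og:image" content="https://example.com/cover.png">
</head><body></body></html>`;

const preview = extractPreview(sample);
console.log(preview);
```

Regular expressions are fragile on real-world HTML, which is one reason the PHP page relies on get_meta_tags and DOMXPath; the sketch is meant only to make the three extracted fields concrete.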

This post explains how to extract URL data the way Facebook does, using PHP, jQuery and Ajax. In URL extraction we retrieve the meta tag information that a web page makes available in its head tag. From the meta tags we pick up important information such as the title, the description, images and some URLs. You may have observed that companies such as Facebook, Twitter, Google and LinkedIn use this technique in their web pages.

This extraction technique is also called web scraping. The term refers to an application that processes the HTML of a web page for manipulation, such as converting the web page to another format (for example, HTML to WML).

This post provides a simple example, and it also gives an idea of how to get cross-domain data with jQuery and Ajax by sending the request through a PHP page on your own server. The extraction process is divided into simple steps, one per file.

In index.html we have created the text-box field in which the user provides the URL for link extraction.

The source code in fetch_url.php extracts information from the web page at the requested link, such as the page title, the page description and an image.

That is all about extracting URL data. In case of any query or suggestion, please comment below.