Problem description
I am working on a website and using the Java Jsoup library to extract some hyperlinks:
Document doc = Jsoup.connect("http://www.saudisale.com/SS_a_mpg.aspx").get();
Elements script = doc.select("script");
for (Element elementary : doc.select("table"))
{
    // Print the onClick attribute of the first matching input inside each table
    System.out.println(elementary.select("tbody").select("tr").select("td").select("input").attr("onClick"));
}
Sample output:
window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dyaralez.html ','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dyaralez.html ','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dalel.html','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dalel.html','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('SS_a_car.aspx?carid=37240','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('SS_a_car.aspx?carid=37240','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
Since Jsoup does not execute JavaScript, I have to write some Java code by hand to convert the window.open(hyperlink) JavaScript calls into absolute hyperlinks.
For example, the following JavaScript output has to be converted from
window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode=1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1')
to: http://saudisale.com/arPrivatePage.aspx?id=21871638
and from
window.open('SS_a_car.aspx?carid=37149','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
to: http://www.saudisale.com/SS_a_car.aspx?carid=37149
Could someone guide me on how to accomplish this in Java?
Recommended answer
Use a regex. This will do what you want:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');";
// Group 1 captures the first argument of window.open(...), without quotes or trailing whitespace
String regex = "window\\.open\\(['\"]*(.*?)\\s*['\"]*,";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    String output = matcher.group(1);
    System.out.println(output);
}
Your last two URLs are relative, so you have to convert them to absolute URLs by resolving them against the URL of the page they came from.
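The conversion from relative to absolute can be sketched with java.net.URI, which is part of the standard library. This is only a minimal sketch under some assumptions: the class name WindowOpenUrls and the helper toAbsoluteUrl are made up for illustration, the base URL is the page address from the question's Jsoup.connect call, and the regex is a slightly stricter variant of the one above that also accepts a window.open call with a single argument.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the original answer.
class WindowOpenUrls {
    // Group 1 is the first argument of window.open(...), without quotes or surrounding spaces.
    private static final Pattern WINDOW_OPEN =
            Pattern.compile("window\\.open\\(\\s*['\"]\\s*(.*?)\\s*['\"]\\s*[,)]");

    /** Returns the absolute URL found in an onClick value, or null if no window.open call matches. */
    static String toAbsoluteUrl(String baseUrl, String onClick) {
        Matcher m = WINDOW_OPEN.matcher(onClick);
        if (!m.find()) {
            return null;
        }
        // URI.resolve returns absolute URLs unchanged and resolves relative ones against the base.
        // URI.create throws IllegalArgumentException if the text is not a valid URI (e.g. inner spaces).
        return URI.create(baseUrl).resolve(m.group(1)).toString();
    }

    public static void main(String[] args) {
        // Assumed base: the page URL passed to Jsoup.connect in the question.
        String base = "http://www.saudisale.com/SS_a_mpg.aspx";
        String relative = "window.open('SS_a_car.aspx?carid=37149','_blank','channelmode =1,scrollbars=1');";
        String absolute = "window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1');";
        System.out.println(toAbsoluteUrl(base, relative)); // http://www.saudisale.com/SS_a_car.aspx?carid=37149
        System.out.println(toAbsoluteUrl(base, absolute)); // http://saudisale.com/arPrivatePage.aspx?id=21871638
    }
}
```

In the question's loop, the value returned by attr("onClick") would be passed as the second argument; tables without a matching input yield an empty string, for which the helper returns null.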