Getting a URL parameter in Java and extracting specific text from that URL

I think one of the easiest ways would be to parse the string returned by URL.getQuery(), as follows:

    public static Map<String, String> getQueryMap(String query) {
        String[] params = query.split("&");
        Map<String, String> map = new HashMap<String, String>();
        for (String param : params) {
            String name = param.split("=")[0];
            String value = param.split("=")[1];
            map.put(name, value);
        }
        return …
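The excerpt above is cut off, so here is a separate, minimal sketch of the same idea rather than a completion of it. It assumes Java 10 or later (for the URLDecoder.decode overload that takes a Charset); the class name QueryParams, the method name parse, and the example URL are all hypothetical. Unlike a bare split("=")[1], it tolerates parameters that have no value, and it percent-decodes names and values after splitting. If a name repeats, the last value wins.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the original answer.
class QueryParams {

    // Splits a raw (still percent-encoded) query string into name/value pairs.
    static Map<String, String> parse(String query) {
        Map<String, String> map = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return map;
        }
        for (String param : query.split("&")) {
            if (param.isEmpty()) {
                continue; // skip empty segments such as in "a=1&&b=2"
            }
            // Limit of 2 keeps any further '=' characters inside the value.
            String[] pair = param.split("=", 2);
            String name = decode(pair[0]);
            String value = pair.length > 1 ? decode(pair[1]) : "";
            map.put(name, value);
        }
        return map;
    }

    private static String decode(String s) {
        // Assumes UTF-8 encoded query strings; requires Java 10+.
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Example URL is made up for illustration.
        URI uri = URI.create("https://example.com/search?q=hello%20world&lang=en&debug");
        Map<String, String> params = parse(uri.getRawQuery());
        System.out.println(params.get("q"));    // hello world
        System.out.println(params.get("lang")); // en
    }
}
```

Decoding after the split matters: if the query were decoded first, an encoded & or = inside a value would be mistaken for a separator.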

Extracting text from a PDF file using PDFMiner in Python?

Here is a working example of extracting text from a PDF file using the current version of PDFMiner (September 2016):

    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
    from pdfminer.converter import TextConverter
    from pdfminer.layout import LAParams
    from pdfminer.pdfpage import PDFPage
    from io import StringIO

    def convert_pdf_to_txt(path):
        rsrcmgr = PDFResourceManager()
        retstr = StringIO()
        codec = 'utf-8'
        laparams = LAParams()
        device = TextConverter(rsrcmgr, …

How to extract a substring using regex

Assuming you want the part between single quotes, use this regular expression with a Matcher:

    "'(.*?)'"

Example:

    String mydata = "some string with 'the data i want' inside";
    Pattern pattern = Pattern.compile("'(.*?)'");
    Matcher matcher = pattern.matcher(mydata);
    if (matcher.find()) {
        System.out.println(matcher.group(1));
    }

Result:

    the data i want
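The example above prints only the first match. If the input may contain several quoted parts, calling find() in a loop collects them all. The following is a minimal sketch under that assumption; the class name QuotedText, the method name extractAll, and the sample string are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class QuotedText {

    // Same pattern as above: a reluctant group between two single quotes.
    private static final Pattern SINGLE_QUOTED = Pattern.compile("'(.*?)'");

    // Returns the text of every single-quoted span, in order of appearance.
    static List<String> extractAll(String input) {
        List<String> results = new ArrayList<>();
        Matcher matcher = SINGLE_QUOTED.matcher(input);
        while (matcher.find()) {
            results.add(matcher.group(1)); // group 1 excludes the quotes
        }
        return results;
    }

    public static void main(String[] args) {
        String data = "first 'alpha' then 'beta' and finally 'gamma'";
        System.out.println(extractAll(data)); // [alpha, beta, gamma]
    }
}
```

The reluctant quantifier .*? is what keeps each match short; a greedy .* would capture everything from the first quote to the last one in a single group.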

Extract email id from text file by showing path

Something like the following will work:

    import re

    with open('resume.txt', 'r') as f_input:
        print(re.findall(r'\b([a-z0-9-_.]+?@[a-z0-9-_.]+)\b', f_input.read(), re.I))

It will display:

    ['[email protected]']

But the exact logic can be far more complicated. It all depends on how accurate it needs to be. This will display all email addresses in the text file, just in case there is more than …