Use XPath to locate email addresses from Mailto links on any website
Mailto links point to an email address instead of a web page URL. When a visitor clicks a mailto link, the default email client on their computer opens a new message addressed to the email address given in the link.
If a website contains mailto links, it is possible to scrape email addresses from it regardless of the website structure.
A standard mailto link looks like this in the HTML source code:
<a href="mailto:email@example.com">Send Email</a>
<a href="mailto:email@example.com, secondemail@example.com">Send Email</a>
So the XPath below, which selects every link whose href attribute contains "mailto", may work wonders in some cases:
//a[contains(@href,'mailto')]
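The XPath returns whole link elements, so the addresses still have to be pulled out of each href value. The following is a minimal sketch of that step, not a definitive implementation. It assumes Python and uses only the standard library: html.parser stands in for an XPath engine and applies the same "href contains mailto" test, so it is an equivalent filter rather than XPath itself. The names MailtoCollector, emails_from_href and extract_emails are hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class MailtoCollector(HTMLParser):
    """Collect href values of <a> tags whose href contains 'mailto'.

    This mirrors the XPath //a[contains(@href,'mailto')] without
    needing an XPath engine.
    """

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if "mailto" in href:
            self.hrefs.append(href)


def emails_from_href(href):
    """Split one mailto href into its individual addresses."""
    # Keep only what follows the scheme; an href that merely contains
    # the word "mailto" (no "mailto:") yields an empty result.
    _, _, rest = href.partition("mailto:")
    # Drop an optional query part such as ?subject=Hello.
    recipients = rest.split("?", 1)[0]
    # Several addresses may be listed, separated by commas.
    return [unquote(part).strip() for part in recipients.split(",") if part.strip()]


def extract_emails(html):
    """Return the unique addresses found in mailto links, in page order."""
    collector = MailtoCollector()
    collector.feed(html)
    emails = []
    for href in collector.hrefs:
        for email in emails_from_href(href):
            if email not in emails:
                emails.append(email)
    return emails


if __name__ == "__main__":
    page = (
        '<a href="mailto:email@example.com">Send Email</a>'
        '<a href="mailto:email@example.com, secondemail@example.com">Send Email</a>'
    )
    print(extract_emails(page))
    # prints ['email@example.com', 'secondemail@example.com']
```

With a tool that supports XPath natively, the expression above can be used directly and only the emails_from_href step is needed afterwards.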
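Note that this trick only applies to email addresses published as mailto hyperlinks like the ones shown above; addresses that appear only as plain text on the page are not matched.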