I have a list of different websites. I would like to scrape the contact email from each website (if it has one).

Comments

1 comment

  • Scarlett

    1. Create a new task and save a list of URLs: Extract data from a list of URLs
    You can extend the timeout, because some URLs take time to redirect. If you can supply the direct (final) URLs when creating the task, that would be better.



    2. Find the contact us button and choose to click the link:


    3. Extract the email text:


    4. Modify the XPath of the email field to //a[contains(@href,'mailto')]
    This XPath only works when the email text is hyperlinked to the email address (a mailto link).


    5. Modify the click item XPath: //a[contains(@href,'contact')]
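
For anyone who wants to check the two XPath expressions outside the tool, here is a minimal Python sketch that approximates them using only the standard library. It is not how the scraping tool works internally. The names `ContactLinkParser` and `parse_links` are made up for this illustration. It looks for "mailto:" (slightly stricter than the XPath) and checks mailto first, because an address such as contact@example.com would otherwise also match the 'contact' test.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urljoin


class ContactLinkParser(HTMLParser):
    """Collects <a> hrefs, roughly mirroring the two XPath expressions."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.emails = []         # approximates //a[contains(@href,'mailto')]
        self.contact_pages = []  # approximates //a[contains(@href,'contact')]

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if "mailto:" in href:
            # Keep only the address: drop the scheme and any ?subject=... part.
            address = href.split("mailto:", 1)[1].split("?", 1)[0]
            address = unquote(address).strip()
            if address and address not in self.emails:
                self.emails.append(address)
        elif "contact" in href:
            # Resolve relative links such as /contact-us against the page URL.
            self.contact_pages.append(urljoin(self.base_url, href))


def parse_links(html, base_url):
    """Return (emails, contact_pages) found in one page's HTML."""
    parser = ContactLinkParser(base_url)
    parser.feed(html)
    return parser.emails, parser.contact_pages
```

A possible use: call `parse_links` on the home page HTML first; if it returns no emails, fetch each URL in `contact_pages` and call it again on that page.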





    Some websites do not have a Contact us link on the home page. Some sites also show the email address as an image rather than as text with a hyperlink; those addresses will not be scraped either.
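
If an address appears as plain text without a hyperlink, one possible fallback outside the tool is a regular expression over the page text. This is only a rough sketch: the pattern is a simplification and can produce false matches, the name `find_plain_text_emails` is invented for the example, and it still cannot read an address shown as an image.

```python
import re

# Simplified pattern: local part, "@", domain labels, and a final
# alphabetic label of at least two letters. Not a full address validator.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def find_plain_text_emails(page_text):
    """Return unique email-like strings in order of first appearance."""
    found = []
    for address in EMAIL_RE.findall(page_text):
        if address not in found:
            found.append(address)
    return found
```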
