The "Combine data" feature merges data that was extracted into separate rows into one single row.
Let's suppose you need to extract posts from a blog. In some cases, you might not be able to select the entire post as one element, so each paragraph ends up in its own row, but you still want the whole post in one single row.
This is the perfect time to use the "Combine data" feature to merge the extracted data into one single row. Let's see how to get this done with an example.
Here we use the blog content from https://philipyancey.com/a-view-from-abroad to demonstrate.
1) Select the desired data to extract
1. Click on the first paragraph on the page and choose "Select all" on the Tips panel. A "Loop Item" will be created to extract every paragraph of the post.
2. Select "Extract text of the selected elements".
2) Combine the extracted data
1. Double-click the "Extract Data" action to open the settings panel.
2. Click on , choose "Combine data", and select "Combine the captured data"
You are all set! Run the task and check the exported data. The paragraphs captured in the "Text" field are now combined into a single row as one big chunk.
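To make the effect of the steps above concrete, here is a minimal Python sketch of what combining does to the exported rows. It is not the tool's own implementation: the function name `combine_rows`, the newline separator, and the sample rows are assumptions made for illustration only.

```python
def combine_rows(rows, field="Text", separator="\n"):
    """Merge the values of one field across many rows into a single row.

    Assumption: blank values are skipped and the pieces are joined with a
    newline; the separator the tool actually uses may differ.
    """
    parts = [row[field].strip() for row in rows if row.get(field, "").strip()]
    return {field: separator.join(parts)}


# Hypothetical rows, one per extracted paragraph.
rows = [
    {"Text": "First paragraph."},
    {"Text": "Second paragraph."},
    {"Text": "  "},  # a blank row, dropped when combining
    {"Text": "Third paragraph."},
]

combined = combine_rows(rows)
print(combined["Text"])
```

Before combining, the export holds one row per paragraph. Afterwards it holds a single row whose "Text" field contains the whole post.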
1. "Combine data" is especially useful for extracting articles from any website. You can extract the article as one whole chunk with no other elements like blank lines, comments, or images.
2. Once the data is combined into one big chunk, you can use the Data reformat tools to add a prefix or suffix, such as "|" or "\", to reformat it.
3. If there are multiple fields to extract, you need to set up "Combine the captured data" separately for every field.
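The last two tips can be sketched in Python as well. This is only an illustration under stated assumptions: `combine_fields` and `add_affixes` are hypothetical helpers rather than functions of the tool, and the "Title" field and sample values are made up.

```python
def combine_fields(rows, fields, separator="\n"):
    """Combine each listed field independently, as if combining were
    configured once per field. Empty values are skipped (an assumption)."""
    combined = {}
    for field in fields:
        values = [row[field] for row in rows if row.get(field)]
        combined[field] = separator.join(values)
    return combined


def add_affixes(text, prefix="", suffix=""):
    """Wrap already-combined text with an optional prefix and suffix."""
    return f"{prefix}{text}{suffix}"


# Hypothetical rows with two fields; only "Text" repeats on every row.
rows = [
    {"Title": "A View from Abroad", "Text": "p1"},
    {"Title": "", "Text": "p2"},
]

result = combine_fields(rows, ["Title", "Text"])
result["Text"] = add_affixes(result["Text"], prefix="|", suffix="|")
print(result)
```

Each field is combined on its own, which mirrors the need to configure the option once per field. The prefix and suffix are applied only after combining, so they wrap the whole chunk rather than each paragraph.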