Extract data from a paginated API with tRESTClient

I need to crawl a paginated REST API with offset and limit parameters using Talend. The API gives me a list of the resources I am interested in.

For instance, the response to the initial request with offset=0 and limit=2 is:

{
    "meta": {
        "limit": 2,
        "next": "/api/v1/request/?offset=2&limit=2",
        "offset": 0,
        "previous": null,
        "total_count": 4300
    },
    "objects": [
        {
            "id": 1,
            "name": "foo"
        },
        {
            "id": 2,
            "name": "bar"
        }
    ]
}

As you can see, the response object contains two keys: objects, which holds one page of the desired resources, and meta, whose next field gives the URL of the next page to query. So far I am able to perform the initial request with tRESTClient. However, I don't know how to proceed from here and request the remaining pages using the URL given in next.
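Besides next, the meta block carries total_count and limit, so the offsets of all pages could also be computed up front. The following is only a sketch in plain Java (the language Talend jobs compile to), assuming total_count does not change while crawling; the class and method names are hypothetical and not part of Talend or the API.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetPlan {
    // Returns every offset to request: 0, limit, 2*limit, ... while below totalCount.
    static List<Integer> offsets(int totalCount, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        List<Integer> result = new ArrayList<>();
        for (int offset = 0; offset < totalCount; offset += limit) {
            result.add(offset);
        }
        return result;
    }

    public static void main(String[] args) {
        // Values taken from the sample response: total_count = 4300, limit = 2.
        List<Integer> plan = offsets(4300, 2);
        // Prints "2150 requests, last offset 4298"
        System.out.println(plan.size() + " requests, last offset " + plan.get(plan.size() - 1));
    }
}
```

With a limit of 2 this means 2150 requests, so raising limit (if the API allows it) would reduce the number of round trips considerably.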

How can I perform multiple requests to that API so that I iterate over the whole list until next is null (i.e. the list is exhausted)?
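To make the intended control flow concrete, here is a rough sketch of the loop in plain Java, independent of any Talend component. The HTTP call is replaced by a hypothetical fetch function backed by an in-memory map, the Page class is an assumed stand-in for the decoded response, and the third resource name ("baz") is made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class NextLinkCrawler {
    // One decoded response: the names found in "objects" plus the "meta.next" URL
    // (null on the last page).
    static final class Page {
        final List<String> names;
        final String next;

        Page(List<String> names, String next) {
            this.names = names;
            this.next = next;
        }
    }

    // Follows the "next" links until one is null, collecting every name on the way.
    static List<String> crawl(String firstUrl, Function<String, Page> fetch) {
        List<String> all = new ArrayList<>();
        String url = firstUrl;
        while (url != null) {
            Page page = fetch.apply(url);
            all.addAll(page.names);
            url = page.next; // null ends the loop
        }
        return all;
    }

    public static void main(String[] args) {
        // Stand-in for the real API: two pages held in memory.
        Map<String, Page> fake = new HashMap<>();
        fake.put("/api/v1/request/?offset=0&limit=2",
                new Page(List.of("foo", "bar"), "/api/v1/request/?offset=2&limit=2"));
        fake.put("/api/v1/request/?offset=2&limit=2",
                new Page(List.of("baz"), null));

        System.out.println(crawl("/api/v1/request/?offset=0&limit=2", fake::get));
    }
}
```

In a Talend job, the same shape would presumably map onto a loop whose condition checks a global variable holding the latest next value, with each iteration performing one request and then updating that variable.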

I tried to figure out how tSetGlobalVar and tLoop could help me, so far without success. Then again, I am a Talend newbie.

Current job

This is what my job currently looks like:

Screenshot of the current job



Read more here: https://stackoverflow.com/questions/18636493/extract-data-from-a-paginated-api-with-trestclient

Content Attribution

This content was originally published by cyroxx at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
