Having trouble getting next page in Scrapy

Question:

I am learning Scrapy and building a simple crawler to reinforce what I have learned. I am trying to get the next-page link but am having trouble. Can anyone point me in the right direction? The link is located in the a element of the final li.

The pagination div is as follows:

<div class="pagination pagination-small hidden-phone">
    <ul>
        <li><a href="./viewforum.php?f=399&amp;start=40" data-original-title="" title=""><i
                class="icon-chevron-left"></i></a></li>
        <li><a href="./viewforum.php?f=399" data-original-title="" title="">1</a></li>
        <span class="page-sep">, </span>
        <li><a href="./viewforum.php?f=399&amp;start=40" data-original-title="" title="">2</a></li>
        <span class="page-sep">, </span>
        <li class="active"><a data-original-title="" title="">3</a></li>
        <span class="page-sep">, </span>
        <li><a href="./viewforum.php?f=399&amp;start=120" data-original-title="" title="">4</a></li>
        <span class="page-sep">, </span>
        <li><a href="./viewforum.php?f=399&amp;start=160" data-original-title="" title="">5</a></li>
        <span class="page-sep">, </span>
        <li><a href="./viewforum.php?f=399&amp;start=200" data-original-title="" title="">6</a></li>
        <li class="active"><a class="pointer-fix" href="#" onclick="jumpto(); return false;" title=""
                              data-original-title="Jump to page"> ... </a></li>
        <li><a href="./viewforum.php?f=399&amp;start=311244" data-original-title="" title="">10012</a></li>
        <li><a href="./viewforum.php?f=399&amp;start=120" data-original-title="" title=""><i
                class="icon-chevron-right"></i></a></li>
    </ul>
</div>

I have tried different variations of the following, but the wrong li is returned. It still gives me the class="active" li even though I used li:not([class="page-sep, active"]):
response.css('div.pagination.pagination-small.hidden-phone').css('li:not([class="page-sep, active"])').get()

Example:

>>> response.css('div.pagination.pagination-small.hidden-phone').css('li:not([class="active, page-sep"])').get()
'<li class="active"><a>1</a></li>'

Thanks

Asked By: user19782805

||

Answers:

Since it’s the last li in the list, we can use this to our advantage.

css:

In [1]: response.css('div.pagination li:last-child a::attr(href)').get()
Out[1]: './viewforum.php?f=399&start=120'

xpath:

In [2]: response.xpath('//div[contains(@class, "pagination")]//li[last()]/a/@href').get()
Out[2]: './viewforum.php?f=399&start=120'
Answered By: SuperUser