Detecting a Python program's own memory footprint in real time

I was recently doing some text statistics in Python and ran into an interesting problem: how to store the results.

Keeping everything in memory did not work: memory ran out after about ten hours and the program was forced to close. Writing straight to the database was no better, because every write was too slow. After more than ten hours it was clear that, at this rate, the job would take weeks, which is not a solution.

In the end, I thought of a solution that gets the best of both [……]
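
The solution itself is cut off in this excerpt. Purely to illustrate the idea in the title, one way a Python program could watch its own memory use is the standard-library `tracemalloc` module: count in memory, and hand the partial results to slower storage whenever the traced size passes a limit. This is a minimal sketch under that assumption, not the approach from the full article; `traced_bytes`, `count_words`, `limit_bytes` and `flush` are hypothetical names.

```python
import tracemalloc
from collections import Counter


def traced_bytes():
    """Bytes currently allocated by Python code, as tracked by tracemalloc.

    Note: this covers Python-level allocations only, not the whole process.
    """
    current, _peak = tracemalloc.get_traced_memory()
    return current


def count_words(lines, limit_bytes, flush):
    """Count words in memory and call flush(partial_counts) when memory grows too large.

    flush is a hypothetical callback that merges partial counts into slower
    storage (for example a database) and must not keep a reference to them.
    """
    tracemalloc.start()
    baseline = traced_bytes()
    counts = Counter()
    for line in lines:
        counts.update(line.split())
        # Check the footprint after each line; spill to storage if over the limit.
        if traced_bytes() - baseline > limit_bytes:
            flush(counts)
            counts = Counter()
    if counts:
        # Write out whatever is left at the end.
        flush(counts)
```

Here `flush` would be where the slow database write happens, so it runs once per batch rather than once per word.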

Using pip properly to install Python packages and avoid TypeError: 'module' object is not callable

Previously, this is how I installed and used pip on macOS:

Later, that method stopped working and it became this:

Finally, one day pip prompted me to update it, and then:

Making pip go through a proxy

When using Python, you often need to download third-party frameworks. Fortunately, Python has a package management tool similar to apt: pip.

However, although pip can manage packages, there is no mirror source to switch to, and most of the packages we download sit on large code-hosting servers abroad. As a result, a package of a few hundred KB often takes an hour to download.
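
As a rough illustration of the idea in the title, pip accepts a `--proxy` option, so an install can be routed through a proxy from a script. The sketch below only builds the command; the helper name `pip_install_command`, the package name and the proxy address are placeholders, not values from the article.

```python
import sys


def pip_install_command(package, proxy=None):
    """Build a `python -m pip install` command, optionally routed through a proxy."""
    # Running pip as a module of the current interpreter avoids picking up
    # a pip script that belongs to a different Python installation.
    cmd = [sys.executable, "-m", "pip", "install"]
    if proxy:
        # --proxy is a general pip option; the address here is supplied by the caller.
        cmd += ["--proxy", proxy]
    cmd.append(package)
    return cmd
```

The resulting list can be passed to `subprocess.run(..., check=True)`, for example with a hypothetical local proxy such as `http://127.0.0.1:8080`.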

Writing a domain whitelist crawler in Python

Some time ago I wrote an article saying that it is time to use a whitelist to get over the wall. But that whitelist is long out of date and no longer works so smoothly, so I boasted that I would write a crawler of my own to crawl China's domains and keep the whitelist up to date.

Well, in short, the crawler was written, went live, and crawled more than ten thousand domains. In the end, though, I found that the earlier approach was the better solution, so the crawler project was abandoned [……]
