Using cmph in Swift

cmph stands for C Minimal Perfect Hashing Library. It is a well-known minimal perfect hashing library written in C. So what is a perfect hash?

Perfect hashing

I will not go into the theory here. You only need to know that a traditional hash function produces collisions, and we have to rely on one algorithm or another to resolve them. A hash always needs a table with many reserved slots. The computed hash value is the index of one of those slots, and the data is stored at that index.

This is where a problem appears. If the table is not large enough, or the hash function is not good enough, clustering occurs: values keep colliding in one region of the table while another region stays empty. As a result, the table always has to be longer than the number of keys to leave room for these unused slots.

But if your algorithm is good enough, the table length n can be made exactly equal to the number of keys, which is the minimum possible.

When that is achieved, we call the hash function a minimal perfect hash function.
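To make the idea concrete, here is a toy sketch in Swift. It is not how cmph works internally; it simply tries seeds by brute force until a small key set maps onto the slots 0..<n with no collision and no empty slot. The function names and the sample keys are made up for illustration.

```swift
// Toy illustration only: cmph uses far more sophisticated constructions.

/// Maps a key to a slot in 0..<tableSize using a seeded FNV-1a-style hash.
func slot(for key: String, seed: UInt32, tableSize: Int) -> Int {
    var hash: UInt64 = 14695981039346656037 ^ (UInt64(seed) &* 0x9E3779B97F4A7C15)
    for byte in key.utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 1099511628211
    }
    // Extra mixing so the low bits used by % depend on the whole hash.
    hash ^= hash >> 33
    hash = hash &* 0xff51afd7ed558ccd
    hash ^= hash >> 33
    return Int(hash % UInt64(tableSize))
}

/// Tries seeds until every key gets its own slot in a table of exactly keys.count slots.
/// Returns nil if no seed below maxSeed works (for example when keys repeat).
func findPerfectSeed(for keys: [String], maxSeed: UInt32 = 1_000_000) -> UInt32? {
    guard !keys.isEmpty else { return nil }
    for seed in 0..<maxSeed {
        let used = Set(keys.map { slot(for: $0, seed: seed, tableSize: keys.count) })
        if used.count == keys.count { return seed }
    }
    return nil
}

// Hypothetical sample keys.
let fruitKeys = ["apple", "banana", "cherry", "date"]
if let seed = findPerfectSeed(for: fruitKeys) {
    for key in fruitKeys {
        print(key, "->", slot(for: key, seed: seed, tableSize: fruitKeys.count))
    }
}
```

Brute force like this only works for a handful of keys, which is exactly why a dedicated library is needed for millions of them.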


Does such a thing really exist? Of course, and cmph is one implementation. Using cmph does come with a few preconditions: the keys must not repeat, and the set of keys must be fixed (static).

The transition matrix of my Hidden Markov Model, for example, meets these requirements.

However, cmph is written in C, so we need to bridge it into Swift.


Bridging C works almost the same way as bridging Objective-C. Drag the source files from the src directory of the cmph download into your Swift project (remember to choose copy, not reference). If this is the first time you add code in another language, Xcode automatically offers to create a bridging header. Your project then contains a file named xxx-Bridging-Header.h, where xxx is the name of your project.

Writing #include "cmph.h" in that file is enough to expose the cmph functions to Swift.

Cleaning up the project

The imported cmph source cannot be used in a Swift project as it stands. If you build the otherwise empty project, you will hit a "Multiple main function" error. Go to the cmph source files and delete main.c, bm_numbers.c and bm_numbers.h, then try to build again. If you still get the error, delete any remaining source file that contains a main function. We only need the library functions.


cmph is in fact both a library and a command-line tool, which is why its source contains main functions. To get the tool, go to the cmph download directory and compile it there.

Once it is built, you can run the cmph command in a terminal.

Create a hash table

We create the hash table with the command-line tool rather than in code. Because the keys must be static, prepare the list of keys in advance as a plain text file with one key per line. Even ten million keys are handled without trouble.

The encoding is UTF-8.
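If the keys come from your own data, a few lines of Swift can produce such a file. The sketch below is only one way to do it: the error type, the function name, the sample keys and the temporary output path are all made up for illustration. It rejects duplicate keys and keys containing a line break, then writes one key per line in UTF-8.

```swift
import Foundation

enum KeyListError: Error {
    case duplicateKey(String)
    case keyContainsNewline(String)
}

/// Builds the text of a key file: one key per line, no duplicates.
func makeKeyFileContents(_ keys: [String]) throws -> String {
    var seen = Set<String>()
    for key in keys {
        // A line break inside a key would split it into two keys.
        if key.contains("\n") { throw KeyListError.keyContainsNewline(key) }
        // cmph requires every key to be unique.
        if !seen.insert(key).inserted { throw KeyListError.duplicateKey(key) }
    }
    return keys.joined(separator: "\n") + "\n"
}

// Hypothetical keys, written to a temporary location as UTF-8.
let keyFileText = try makeKeyFileContents(["1001", "1002", "1003"])
let keyFileURL = URL(fileURLWithPath: NSTemporaryDirectory())
    .appendingPathComponent("keys.txt")
try keyFileText.write(to: keyFileURL, atomically: true, encoding: .utf8)
```

Checking for duplicates up front is cheaper than discovering them when the hash generation fails.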

Following the tool's usage message, we run cmph -g keys.txt directly in that directory. You can also specify the algorithm (the available algorithms and their characteristics are described on the official website). For example, I know my database will be used from external memory, so I pick BRZ, the fast algorithm that cmph optimizes for external memory, and the command becomes cmph -g -a brz keys.txt

After the command finishes, keys.txt.mph is generated in the current directory. For convenience, I rename it to keys.mph.


To see the results, prepare another text file in the same format as the key list, containing the keys you want to query. I named mine Keyser.txt. Then run the query with debug output enabled to print the position of each key.

Get results

Now we can look up every key the same way and obtain the position that corresponds to each one.

With these results, we can generate a data file ordered by position. I used my own encoding to build a binary database, laid out so that each record sits at the position cmph assigns to its key. Data and position then correspond one to one, and a query takes O(1) time.
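The layout can be sketched as follows. This is not my actual database format; it assumes, purely for illustration, that every record is a fixed 4 bytes (one little-endian UInt32) and that the positions have already been collected into a dictionary. With fixed-size records, the byte offset of a record is just position times record size, which is what makes the lookup O(1).

```swift
// Hypothetical record size: each record is one little-endian UInt32.
let recordSize = 4

/// Places each key's value at the slot given by its position.
/// `positions` maps a key to the position reported for it (assumed input).
func buildDatabase(positions: [String: Int], values: [String: UInt32]) -> [UInt8] {
    var bytes = [UInt8](repeating: 0, count: positions.count * recordSize)
    for (key, position) in positions {
        guard let value = values[key] else { continue }
        for i in 0..<recordSize {
            bytes[position * recordSize + i] = UInt8((value >> UInt32(8 * i)) & 0xFF)
        }
    }
    return bytes
}

/// Reads the record at a position with a single offset computation.
func readRecord(at position: Int, from bytes: [UInt8]) -> UInt32 {
    var value: UInt32 = 0
    for i in 0..<recordSize {
        value |= UInt32(bytes[position * recordSize + i]) << UInt32(8 * i)
    }
    return value
}

// Hypothetical positions and values for three keys.
let samplePositions = ["1001": 2, "1002": 0, "1003": 1]
let sampleValues: [String: UInt32] = ["1001": 10, "1002": 20, "1003": 30]
let database = buildDatabase(positions: samplePositions, values: sampleValues)
print(readRecord(at: 2, from: database))  // the record stored for key "1001"
```

Variable-length records would need an extra offset table, but the principle of indexing by position stays the same.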

Querying from Swift

Now we can add the generated mph file to the Swift project, along with the database file whose records are ordered to match it, of course.

Calling C functions from Swift is rather awkward.

First, you need a property to hold the hash table object. Its type is a pointer to a type that Swift cannot represent, in other words an opaque pointer.

In the initializer, we call cmph's C function for loading a hash table.

With that, the hash table is loaded.
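A minimal sketch of these two steps might look like the following. It is not the original code from my project: the class name and initializer are hypothetical, and it only compiles inside a project whose bridging header includes cmph.h. It relies on cmph_load, cmph_size and cmph_destroy from cmph.h, and assumes that cmph_t is imported as OpaquePointer because only cmph.h is bridged (older Swift versions called this type COpaquePointer).

```swift
import Foundation

/// Hypothetical wrapper around a loaded cmph function.
/// Assumes cmph.h is exposed through the bridging header.
final class PerfectHash {
    // cmph_t is an opaque C struct, so Swift sees a pointer to it as OpaquePointer.
    private let table: OpaquePointer

    /// Loads an .mph file produced by the cmph command-line tool.
    init?(path: String) {
        // cmph_load reads from a C FILE*, so the file is opened with fopen.
        guard let file = fopen(path, "rb") else { return nil }
        defer { fclose(file) }
        guard let loaded = cmph_load(file) else { return nil }
        table = loaded
    }

    deinit {
        // Release the memory cmph allocated for the function.
        cmph_destroy(table)
    }

    /// Number of keys the function was built for.
    var keyCount: UInt32 { cmph_size(table) }
}
```

In an app, the path would typically come from the bundle, for example the location of keys.mph among the project's resources.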

Suppose we want to look up a UInt32 numeric code. We need to convert it into the char* pointer that the cmph function accepts, which can be done with the help of NSString.

Although the Swift developer documentation says that String is bridged to NSString automatically, that apparently did not happen here.

This gives us an integer id variable holding the position. The Swift compiler maps the C return type to a Swift type for you automatically.
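The conversion can be sketched without NSString as well. The version below swaps in String.withCString, which is a different technique from the NSString route described above, and it assumes the keys were written to the key file as decimal text. The function name is hypothetical, and the search parameter is a stand-in for the real call so that the snippet runs on its own.

```swift
import Foundation

/// Turns a numeric code into its text key and passes the C string to `search`.
/// `search` is a hypothetical stand-in for `cmph_search(table, pointer, length)`.
func position(ofCode code: UInt32,
              search: (UnsafePointer<CChar>, UInt32) -> UInt32) -> Int {
    let key = String(code)  // e.g. 1234 becomes "1234"
    return key.withCString { (pointer: UnsafePointer<CChar>) -> Int in
        // The pointer is only valid inside this closure.
        let length = UInt32(strlen(pointer))
        return Int(search(pointer, length))
    }
}

// Inside the project, with a loaded table, this might read:
// let id = position(ofCode: 1234) { cmph_search(table, $0, $1) }
```

Keeping the pointer inside the closure avoids holding on to memory that Swift is free to release afterwards.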

Summary

The theory behind cmph is complex, but that does not stop us from using it. It is worth mentioning that it is licensed under the LGPL, so you can use it freely as long as you follow the terms of that license.

Swift bridges to C very well. The pile of "unsafe" names does make the code look a little nerve-racking, but that is not a flaw in Swift itself. It is there because Swift does not encourage developers to work this way.

