Hands-on Python crawlers (Part 4: ORM and peewee)

First, why use an ORM

  • Isolates the differences between databases and between database versions
  • Easier to maintain
  • Provides protection against SQL injection, among other features
  • Passing variables into queries is simpler
  • ORMs are becoming more and more popular

Second, choosing an ORM

Advantages and disadvantages of each framework:

peewee
  • Advantages: a Django-style API makes it easy to use; lightweight and easy to integrate with any web framework
  • Disadvantages: does not support automatic schema migration; many-to-many queries are not intuitive to write

SQLObject
  • Advantages: uses the easy-to-understand ActiveRecord pattern; a relatively small code base
  • Disadvantages: method and class names follow Java-style camelCase; does not support database sessions for isolating a unit of work

Storm
  • Advantages: a clean, lightweight API with a short learning curve and good long-term maintainability; no special class constructors and no required base classes
  • Disadvantages: forces programmers to write the table-creation DDL by hand instead of deriving it from the model class; contributors must assign the copyright of their contributions to Canonical

Django's ORM
  • Advantages: easy to use with a short learning curve; tightly integrated with Django, so it is the conventional way to work with the database in a Django project
  • Disadvantages: handles complex queries poorly, forcing developers back to raw SQL; the tight integration makes it hard to use outside a Django environment

SQLAlchemy
  • Advantages: an enterprise-level API makes code robust and adaptable; a flexible design makes complex queries easy to write
  • Disadvantages: the unit-of-work concept is unfamiliar to many developers; a heavyweight API with a long learning curve

We chose peewee because it is simple and flexible, and its declarative style is close to Django's ORM. It also has a high star count, an active project, and good documentation.

Official documentation: http://docs.peewee-orm.com/en/latest/

Third, using peewee

1. Installation

Activate the virtual environment, then install peewee:

pip install peewee

2. Create and use

from peewee import *

db = MySQLDatabase("py_spider", host="localhost", port=3307, user="root", password="root")


class Person(Model):
    name = CharField()
    birthday = DateField()

    class Meta:
        database = db  # This model uses the "py_spider" MySQL database.


if __name__ == "__main__":
    db.create_tables([Person])  # Create the table from the model

In the generated table, the table name defaults to the lowercased class name (here, person), and an auto-incrementing id field is added as the primary key by default.

Field types table (database and model field correspondence table)

Field Type        | Sqlite        | Postgresql       | MySQL
AutoField         | integer       | serial           | integer
BigAutoField      | integer       | bigserial        | bigint
IntegerField      | integer       | integer          | integer
BigIntegerField   | integer       | bigint           | bigint
SmallIntegerField | integer       | smallint         | smallint
IdentityField     | not supported | int identity     | not supported
FloatField        | real          | real             | real
DoubleField       | real          | double precision | double precision
DecimalField      | decimal       | numeric          | numeric
CharField         | varchar       | varchar          | varchar
FixedCharField    | char          | char             | char
TextField         | text          | text             | text
BlobField         | blob          | bytea            | blob
BitField          | integer       | bigint           | bigint
BigBitField       | blob          | bytea            | blob
UUIDField         | text          | uuid             | varchar(40)
BinaryUUIDField   | blob          | bytea            | varbinary(16)
DateTimeField     | datetime      | timestamp        | datetime
DateField         | date          | date             | date
TimeField         | time          | time             | time
TimestampField    | integer       | integer          | integer
IPField           | integer       | bigint           | bigint
BooleanField      | integer       | boolean          | bool
BareField         | untyped       | not supported    | not supported
ForeignKeyField   | integer       | integer          | integer

3. Create, read, update, and delete

(1) Add data

if __name__ == "__main__":
    # db.create_tables([Person])  # Create the table
    from datetime import date
    # Build the record in memory
    bob = Person(name="Bob", birthday=date(2020, 12, 12))
    # Insert the record into the database
    bob.save()


(2) Query data

if __name__ == "__main__":
    # Fetch a single row; get() raises Person.DoesNotExist when nothing matches, so wrap it in try/except
    Bob = Person.select().where(Person.name == 'Bob').get()
    print(Bob.name)     # Bob
    print(Bob.birthday)     # 2020-12-12
    # Same as above
    Bob2 = Person.get(Person.name == 'Bob')
    print(Bob2.name)    # Bob
    print(Bob2.birthday)    # 2020-12-12

    # Fetch multiple rows
    Bobs = Person.select().where(Person.name == 'Bob')
    for b in Bobs:
        print(b.name)
        print(b.birthday)

(3) Modify data

if __name__ == "__main__":
    from datetime import date
    # Update the matching rows
    Bobs = Person.select().where(Person.name == 'Bob')
    for b in Bobs:
        b.birthday = date(1997, 10, 16)
        b.save() # save() inserts when the row does not exist yet and updates it when it does


(4) Delete data

if __name__ == "__main__":
    # Delete the matching rows
    Bobs = Person.select().where(Person.name == 'Bob')
    for b in Bobs:
        b.delete_instance()
        



Origin blog.csdn.net/zy1281539626/article/details/111243785