Chinese CLIP: Principles and Practice (work in progress)
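Contents
- Introduction to CLIP
- Using CLIP (and adapting it)
- Inference (requires the cn_clip package)
- CLIP data loading (to be updated)
- CLIP training (to be updated)
- CLIP Fine
- Choosing a pretrained model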
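Introduction to CLIP
Git repository: Chinese-CLIP
Model downloads: Model sizes & download links
Author's write-up on Zhihu:
中文CLIP模型卷土重来,这次加量不加价! ("The Chinese CLIP model is back, with more on offer at the same price!")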
Using CLIP (and adapting it)
Inference (requires the cn_clip package)
# Functions for computing embeddings
import torch
import numpy as np
from PIL import Image
import torchvision
import cn_clip
import cn_clip.clip as clip
from cn_clip.clip import load_from_name, available_models, load
print("Available models:", available_models())
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)
# Smaller alternative checkpoint:
# model, preprocess = load_from_name("ViT-B-16", device=device, download_root='./pt_models')
model, preprocess = load_from_name("ViT-H-14", device=device, download_root='../02_CLIP/pt_models')
model.eval()  # switch to inference mode
print('finished')

from io import BytesIO
from urllib import request

def url2pil(img_url):
    """Download an image from a URL and return it as an RGB PIL image."""
    # Send a browser-like user agent; some servers may reject the default one.
    user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36'
    headers = {'user-agent': user_agent}
    req = request.Request(url=img_url, headers=headers)
    response = request.urlopen(req, timeout=30)
    img = Image.open(BytesIO(response.read())).convert('RGB')
    return img
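
def get_img_emb(url):
    """Return 'status***url***embedding'.

    status: 0 = success, 1 = empty URL, 2 = download failed.
    """
    if not url:
        # The URL is empty, so a literal placeholder is returned in its place.
        return '1' + '***' + 'url'
    try:
        img = url2pil(url)
    except Exception:
        return '2' + '***' + url
    with torch.no_grad():
        # Reuse the image downloaded above instead of fetching it a second time.
        image = preprocess(img).unsqueeze(0).to(device)
        image_features = model.encode_image(image)
        # L2-normalize so that dot products equal cosine similarities.
        image_features /= image_features.norm(dim=-1, keepdim=True)
    image_features = image_features.cpu().numpy()
    emb = ','.join([str(x) for x in image_features.flatten()])
    return "***".join(('0', url, emb))

# get_img_emb(url)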
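
Because get_img_emb packs its result into a single "***"-delimited string, downstream code has to unpack it again. The following is a minimal sketch of such a parser, assuming the three record layouts returned by the function above; the helper name parse_img_emb_record and the example URLs are hypothetical and not part of cn_clip.

```python
def parse_img_emb_record(record):
    """Split a get_img_emb-style record into (status, url, vector).

    Assumed layouts, taken from get_img_emb above:
      "0***<url>***<v1,v2,...>"  success
      "1***url"                  empty URL (literal placeholder)
      "2***<url>"                download failed
    """
    # The status is always the first field.
    status, _, rest = record.partition("***")
    if status != "0":
        # Failure records carry no embedding.
        return status, rest, None
    # Split from the right so a URL that happens to contain "***" stays intact.
    url, _, emb = rest.rpartition("***")
    vector = [float(x) for x in emb.split(",")]
    return status, url, vector


# Made-up records for illustration only.
ok = parse_img_emb_record("0***http://example.com/a.jpg***0.6,0.8")
failed = parse_img_emb_record("2***http://example.com/missing.jpg")
print(ok)
print(failed)
```

Checking the status field first lets a batch job skip failed downloads without attempting to parse an embedding that is not there.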

def get_txt_emb_batch(text):
    """Encode a string or a list of strings into L2-normalized text embeddings (NumPy array)."""
    text = clip.tokenize(text).to(device)
    with torch.no_grad():
        text_features = model.encode_text(text)
        # No detach() is needed: tensors created under no_grad carry no graph.
        text_features /= text_features.norm(dim=-1, keepdim=True)
    text_features = text_features.cpu().numpy()
    return text_features
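
Both embedding functions above L2-normalize their outputs, so the cosine similarity between a text and an image reduces to a plain dot product. The sketch below illustrates ranking images against a text query under that assumption. It uses made-up 3-dimensional vectors and plain Python lists so that it runs without the model; the helper names l2_normalize and rank_by_cosine are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length, mirroring the normalization step above."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def rank_by_cosine(query, candidates):
    """Return (index, score) pairs sorted by descending similarity.

    Assumes every vector is already unit length, so the dot product
    is the cosine similarity.
    """
    scores = [sum(q * c for q, c in zip(query, cand)) for cand in candidates]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for one text embedding and three image embeddings.
text_emb = l2_normalize([1.0, 0.0, 1.0])
image_embs = [
    l2_normalize(v)
    for v in ([0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0])
]

ranking = rank_by_cosine(text_emb, image_embs)
print(ranking)
```

With the NumPy arrays that the functions above return, the same scores can be obtained in one step as a matrix product of the text embeddings with the transposed image embeddings.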
CLIP data loading (to be updated)
CLIP training (to be updated)
CLIP Fine
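Choosing a pretrained model
Weighing parameter count, training difficulty, and training time against the parameter counts and fine-tuning performance reported in the paper, ViT-B/16 is chosen as the pretrained model.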