Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Regex to parse import statements in Python

Can someone help me write a single regex to get the module(s) from a line of Python source?

from abc.lmn import pqr
from abc.lmn import pqr as xyz
import abc
import abc as xyz

Each line has three sub-parts:

[from(s)<module>(s)] --> get module if this part exist
import(s)<module>     --> get module
[(s)as(s)<alias>]    --> ignore if this part exist

So I am after something like this:

:?[from(s)<module>(s)]import(s)<module>:?[(s)as(s)<alias>]
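
For reference, the pseudo-pattern above could be written as a real Python regex roughly as follows. This is only a sketch under some assumptions: the names IMPORT_RE and module_of are made up for illustration, the module from the "from" part is preferred when it exists, and only the simple one-name-per-line forms listed in the question are handled (no "import a, b", no parenthesised or multi-line imports).

```python
import re

# Optional "from <module>", then "import <name>", then optional "as <alias>".
IMPORT_RE = re.compile(
    r"^\s*(?:from\s+(?P<from_module>[\w.]+)\s+)?"
    r"import\s+(?P<name>[\w.]+)"
    r"(?:\s+as\s+(?P<alias>\w+))?\s*$"
)


def module_of(line):
    """Return the module named on a simple import line, or None if no match."""
    match = IMPORT_RE.match(line)
    if match is None:
        return None
    # Prefer the module in the "from" part; otherwise use the imported name.
    # The alias group is matched but deliberately ignored.
    return match.group("from_module") or match.group("name")
```

Calling module_of on each of the four sample lines is expected to return the module without the alias; anything that is not a simple import line returns None.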

1 Answer

Instead of a regex, Python's built-in ast library might be a better approach, since it parses actual Python syntax rather than matching text: https://docs.python.org/2/library/ast.html

import ast

import_string = """from abc.lmn import pqr
from abc.lmn import pqr as xyz
import abc
import abc as xyz"""

modules = []
for node in ast.iter_child_nodes(ast.parse(import_string)):
    if isinstance(node, ast.ImportFrom):
        # "from X import Y": keep X, but skip the statement if Y is aliased
        if not node.names[0].asname:
            modules.append(node.module)
    elif isinstance(node, ast.Import):
        # "import X": keep X, but skip the statement if X is aliased
        if not node.names[0].asname:
            modules.append(node.names[0].name)

That will give you ['abc.lmn', 'abc'] (the two aliased imports are skipped), and it is fairly easy to tweak if you want to pull out other information.
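
As one possible tweak, here is a sketch that keeps the module for every import statement and simply ignores the alias, which seems closer to what the question describes. The function name imported_modules is made up for illustration, and the sketch assumes you only care about top-level statements (imports nested inside functions or if-blocks are not visited).

```python
import ast


def imported_modules(source):
    """Return the module named by each top-level import, ignoring aliases."""
    modules = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"; node.level counts
            # the leading dots of a relative import.
            modules.append("." * node.level + (node.module or ""))
        elif isinstance(node, ast.Import):
            # One statement can name several modules, e.g. "import os, sys".
            modules.extend(alias.name for alias in node.names)
    return modules
```

With the import_string from above this should return one entry per line, so the aliased imports are included as well. Unlike checking only names[0], it also picks up every module in a statement such as "import os, sys".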

