Parser submodule
-
class marcxml_parser.parser.MARCXMLParser(xml=None, resort=True)
Bases: object
This class parses everything between <root> elements. It checks whether a root element is present, so please give it the full XML.

controlfields is a simple dictionary whose keys are field identifiers (three-character strings). The value is always a string.

datafields is a little more complicated: it is a dictionary of arrays of dictionaries, each of which holds arrays of MARCSubrecord objects and two special parameters.

It sounds horrible, but it is not that hard to understand:
    .datafields = {
        "011": [{"ind1": " ", "ind2": " "}],  # array of 0 or more dicts
        "012": [
            {
                "a": ["a) subsection value"],
                "b": ["b) subsection value"],
                "ind1": " ",
                "ind2": " "
            },
            {
                "a": [
                    "multiple values in a) subsections are possible!",
                    "another value in a) subsection"
                ],
                "c": [
                    "subsection identifier is always one character long"
                ],
                "ind1": " ",
                "ind2": " "
            }
        ]
    }
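To make the nesting concrete, the following sketch walks a structure shaped like the example above using plain dicts and lists. It does not use the library itself: the helper name iter_subfield_values is hypothetical, and the values are ordinary strings rather than MARCSubrecord objects.

```python
# Plain-Python stand-in for the .datafields layout described above.
# Values are ordinary strings here; the real parser stores MARCSubrecord objects.
datafields = {
    "011": [{"ind1": " ", "ind2": " "}],
    "012": [
        {"a": ["a) subsection value"], "b": ["b) subsection value"],
         "ind1": " ", "ind2": " "},
        {"a": ["first a) value", "second a) value"],
         "c": ["c) value"], "ind1": " ", "ind2": " "},
    ],
}

INDICATOR_KEYS = ("ind1", "ind2")  # the "two special parameters"


def iter_subfield_values(fields):
    """Yield (datafield, subfield, value) for every subfield value.

    Hypothetical helper, not part of marcxml_parser.
    """
    for name, occurrences in fields.items():
        for occurrence in occurrences:
            for key, values in occurrence.items():
                if key in INDICATOR_KEYS:
                    continue  # indicators are single strings, not lists
                for value in values:
                    yield name, key, value


flat = list(iter_subfield_values(datafields))
```

Each datafield name maps to a list of occurrences, and each occurrence maps one-character subfield names to lists of values, with the two indicator keys stored alongside them.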
-
leader
string – Leader of the MARC XML document.
-
oai_marc
bool – True if the document is an OAI MARC document, False otherwise.
-
controlfields
dict – Control fields stored in a dict.
-
datafields
dict of arrays of dict of arrays of strings – Data fields stored in nested dicts/arrays.
Constructor.
Parameters:
- xml (str/file, default None) – XML to be parsed. May be a file-like object.
- resort (bool, default True) – Whether to sort the output alphabetically.
-
add_ctl_field(name, value)
Add a new control field value under name into the control field dictionary controlfields.
-
add_data_field(name, i1, i2, subfields_dict)
Add a new datafield into datafields and take care of OAI MARC differences.
Parameters:
- name (str) – Name of the datafield.
- i1 (char) – Value of the i1/ind1 parameter.
- i2 (char) – Value of the i2/ind2 parameter.
- subfields_dict (dict) – Dictionary containing subfields (as lists).
subfields_dict is expected to be in this format:
    {
        "field_id": ["subfield data",],
        ...
        "z": ["X0456b"]
    }
Warning
For your own good, use OrderedDict for subfields_dict, or leave the constructor’s resort parameter set to True (it is by default).
Warning
field_id can be only one character long!
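A minimal sketch of preparing subfields_dict in the expected format, using collections.OrderedDict so the subfield order is explicit and checking the one-character rule up front. The helper make_subfields_dict is hypothetical and not part of marcxml_parser.

```python
from collections import OrderedDict


def make_subfields_dict(pairs):
    """Build an ordered {subfield_id: [values]} mapping (hypothetical helper).

    `pairs` is an iterable of (subfield_id, value) tuples; repeated ids
    collect their values in one list, in the order given.
    """
    subfields = OrderedDict()
    for field_id, value in pairs:
        if len(field_id) != 1:
            raise ValueError(
                "field_id must be exactly one character: %r" % (field_id,)
            )
        subfields.setdefault(field_id, []).append(value)
    return subfields


subfields_dict = make_subfields_dict([
    ("a", "first value"),
    ("z", "X0456b"),
    ("a", "second value"),
])
```

The resulting mapping has the shape expected for the last argument of add_data_field(name, i1, i2, subfields_dict).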
-
get_i_name(num, is_oai=None)
This method is used mainly internally, but it can be handy if you work with the raw MARC XML object and do not use the getters.
Parameters:
- num (int) – Which indicator you need (1/2).
- is_oai (bool/None) – If None, oai_marc is used.
Returns: current name of the i1/ind1 parameter, based on the oai_marc property.
Return type: str
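The naming rule can be sketched as a standalone function. This assumes, based on the i1/ind1 naming used throughout this page, that OAI MARC documents use i1/i2 while other MARC XML documents use ind1/ind2. The function indicator_name is hypothetical and only mirrors the documented behaviour.

```python
def indicator_name(num, is_oai):
    """Return the indicator name for the given document flavour.

    Assumption: OAI MARC uses "i1"/"i2", other MARC XML uses "ind1"/"ind2".
    Hypothetical stand-in, not the library's implementation.
    """
    if num not in (1, 2):
        raise ValueError("num has to be 1 or 2")
    prefix = "i" if is_oai else "ind"
    return prefix + str(num)


plain_first = indicator_name(1, is_oai=False)
oai_second = indicator_name(2, is_oai=True)
```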
-
i1_name
Property getter / alias for self.get_i_name(1).
-
i2_name
Property getter / alias for self.get_i_name(2).
-
get_ctl_field(controlfield, alt=None)
Method wrapper over the controlfields dictionary.
Parameters:
- controlfield (str) – Name of the controlfield.
- alt (object, default None) – Alternative value returned when the controlfield couldn’t be found.
Returns: record from the given controlfield
Return type: str
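Because the method is described as a wrapper over a dictionary with a fallback value, its behaviour resembles dict.get. Here is a sketch on a plain dict with a made-up control field value. get_ctl_field_sketch is a hypothetical stand-in, not the library's code.

```python
controlfields = {"001": "example-id-0001"}  # made-up sample value


def get_ctl_field_sketch(fields, controlfield, alt=None):
    """Return the stored value, or `alt` when the controlfield is missing."""
    return fields.get(controlfield, alt)


found = get_ctl_field_sketch(controlfields, "001")
missing = get_ctl_field_sketch(controlfields, "005", alt="")
```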
-
getDataRecords(datafield, subfield, throw_exceptions=True)
Deprecated: use get_subfields() instead.
-
get_subfields(datafield, subfield, i1=None, i2=None, exception=False)
Return the content of the given subfield in the datafield.
Parameters:
- datafield (str) – Section name (for example “001”, “100”, “700”).
- subfield (str) – Subfield name (for example “a”, “1”, etc.).
- i1 (str, default None) – Optional i1/ind1 parameter value, which will be used for the search.
- i2 (str, default None) – Optional i2/ind2 parameter value, which will be used for the search.
- exception (bool) – If True, KeyError is raised when the method couldn’t find the given datafield / subfield. If False, a blank array [] is returned.
Returns: list of MARCSubrecord.
Return type: list
Raises: KeyError – If the subfield or datafield couldn’t be found.
Note
MARCSubrecord is practically the same thing as a string, but has defined MARCSubrecord.i1() and MARCSubrecord.i2 methods. You may need to be able to get these, because MARC XML depends on the i/ind parameters from time to time (names of authors, for example).
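The lookup described above can be approximated on a plain nested dict. This is a sketch under stated assumptions, not the library's implementation: indicators are assumed to be stored under "ind1"/"ind2", values are plain strings instead of MARCSubrecord objects, and the name get_subfields_sketch and the sample data are made up.

```python
def get_subfields_sketch(datafields, datafield, subfield,
                         i1=None, i2=None, exception=False):
    """Look up subfield values in a nested dict shaped like .datafields.

    Assumptions: indicators live under "ind1"/"ind2" (non-OAI naming) and
    values are plain strings rather than MARCSubrecord objects.
    """
    if datafield not in datafields:
        if exception:
            raise KeyError(datafield)
        return []

    output = []
    for occurrence in datafields[datafield]:
        if subfield not in occurrence:
            continue
        # Skip occurrences whose indicators don't match the requested ones.
        if i1 is not None and occurrence.get("ind1") != i1:
            continue
        if i2 is not None and occurrence.get("ind2") != i2:
            continue
        output.extend(occurrence[subfield])

    if not output and exception:
        raise KeyError(subfield)
    return output


sample = {
    "100": [
        {"a": ["Primary author"], "ind1": "1", "ind2": " "},
        {"a": ["Other author"], "ind1": "0", "ind2": " "},
    ],
}

all_names = get_subfields_sketch(sample, "100", "a")
filtered = get_subfields_sketch(sample, "100", "a", i1="1")
nothing = get_subfields_sketch(sample, "700", "a")
```

Passing i1 or i2 narrows the result to occurrences with matching indicators, which is the case the note about author names refers to.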
-