pyeDSL - Embedded Domain Specific Language in Python

 

 

The Signal class serves its purpose of getting raw data for a single signal. However, it is difficult to generalize to acquiring the same signal across several circuits. In addition, typical processing requires additional classes and introduces even more local variables:

circuit_type = 'RB'
circuit_name = 'RB.A12'
t_start = '2015-01-13 16:59:11+01:00'
t_end = '2015-01-13 17:15:46+01:00'
db = 'NXCALS'
system = 'PC'

metadata_pc = SignalMetadata.get_circuit_signal_database_metadata(circuit_type, circuit_name, system, db)
I_MEAS = SignalMetadata.get_signal_name(circuit_type, circuit_name, system, db, 'I_MEAS')

i_meas_nxcals_df = Signal().read(db, signal=I_MEAS, t_start=t_start, t_end=t_end, 
                                 nxcals_device=metadata_pc['device'], nxcals_property=metadata_pc['property'], nxcals_system=metadata_pc['system'], 
                                 spark=spark)

i_meas_nxcals_df = SignalUtilities.synchronize_df(i_meas_nxcals_df)
i_meas_nxcals_df = SignalUtilities.convert_indices_to_sec(i_meas_nxcals_df)

Several design flaws lead to inconsistency and code duplication:

  • use of multiple methods, multiple arguments (duplicated across methods)
  • multiple local variables (naming consistency across analysis modules)
  • order of methods and arguments (with duck typing) not fixed

The API also does not answer the following questions in a generic way:

  • What if we want to get current for each circuit?
  • What if we want to get several current signals?

Natural languages have a certain structure [1]:

Language   Word order                   Example
English    {Subject}.{Verb}.{Object}    John ate cake
Japanese   {Subject}.{Object}.{Verb}    John-ga keiki-o tabeta
                                        (John cake ate)

One can enforce syntactical order in code:

  • Domain Specific Language – new language, requires parser
  • Embedded Domain Specific Language – extends existing language

Furthermore, an eDSL can be implemented following the fluent interface approach [2]. Using an eDSL for signal query and processing is not a new concept: an eDSL in Java is already used to automatically check signals during Hardware Commissioning campaigns of the LHC [3].

[1] K. Gulordava, Word order variation and dependency length minimisation: a cross-linguistic computational approach, PhD thesis, UniGe,
[2] https://en.wikipedia.org/wiki/Fluent_interface
[3] M. Audrain, et al. - Using a Java Embedded Domain-Specific Language for LHC Test Analysis, ICALEPCS2013, San Francisco, CA, USA
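
As a minimal sketch of the fluent interface approach, consider a hypothetical Sentence class (it is not part of lhcsmapi and is introduced here only for illustration). Each method stores one part of the sentence and returns self, so the calls chain in subject-verb-object order:

```python
class Sentence:
    """Hypothetical fluent builder: every step returns self so calls can be chained."""

    def __init__(self):
        self._parts = []

    def subject(self, word):
        self._parts.append(word)
        return self

    def verb(self, word):
        self._parts.append(word)
        return self

    def obj(self, word):
        self._parts.append(word)
        return self

    def build(self):
        # Join the collected words into the final sentence
        return ' '.join(self._parts)


text = Sentence().subject('John').verb('ate').obj('cake').build()
print(text)  # John ate cake
```

Returning self alone does not fix the order of the calls; to enforce it, each step would have to return an object that exposes only the next allowed step.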

1. QueryBuilder()

 

We propose a Python embedded Domain Specific Language (pyeDSL) for building queries:

  • General purpose query
    {DB}.{DURATION}.{QUERY_PARAMETERS}.{QUERY}

e.g.

df = QueryBuilder().{with_pm()/with_cals()/with_nxcals()}.with_duration().with_query_parameters()\
    .signal_query().dfs[0]
df = QueryBuilder().{with_pm()/with_cals()/with_nxcals()}.with_timestamp().with_query_parameters()\
    .signal_query().dfs[0]
  • Circuit-oriented query to generalize query across and within circuit types
    {DB}.{DURATION}.{CIRCUIT_TYPE}.{METADATA}.{QUERY}

e.g.

df = QueryBuilder().{with_pm()/with_cals()/with_nxcals()}.with_duration().with_circuit_type().with_metadata()\
    .signal_query().dfs[0]
df = QueryBuilder().{with_pm()/with_cals()/with_nxcals()}.with_timestamp().with_circuit_type().with_metadata()\
    .signal_query().dfs[0]
  • each parameter defined once (validation of input at each stage)
  • single local variable
  • order of operation is fixed
  • support for vector inputs
  • time-dependent metadata
 

The pyeDSL provides hints on the order of execution

In [61]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

QueryBuilder()
Out[61]:
Set database name using with_pm(), with_cals(ldb), with_nxcals(spark) method.
In [62]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

QueryBuilder().with_nxcals(spark)
Out[62]:
Database name properly set to NXCALS. Set time definition: for PM signal query, with_timestamp(),
 for PM event query or CALS, NXCALS signal query, with_duration()
In [63]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

QueryBuilder().with_nxcals(spark).with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')])
Out[63]:
Query duration set to t_start=1426220359491000200, t_end=1426220559491000200. Set generic query parameter using with_query_parameters() method, or a circuit signal using with_circuit_type() method.
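
The values printed in Out[63] are consistent with reading duration=[(100, 's'), (100, 's')] as 100 s before and 100 s after t_start, with the time string interpreted in CET (UTC+1). The sketch below reproduces that arithmetic with the standard library only. The helpers to_unix_ns and window are hypothetical, the fixed UTC+1 offset is an assumption that holds for this winter-time date, and the handling of a single-element duration is a guess rather than documented behaviour.

```python
from datetime import datetime, timedelta, timezone

NS_PER_UNIT = {'ns': 1, 'us': 10**3, 'ms': 10**6, 's': 10**9}


def to_unix_ns(text, utc_offset_hours=1):
    """Convert 'YYYY-MM-DD HH:MM:SS.fffffff' to Unix time in ns (assumed fixed UTC offset)."""
    whole, _, frac = text.partition('.')
    tz = timezone(timedelta(hours=utc_offset_hours))
    dt = datetime.strptime(whole, '%Y-%m-%d %H:%M:%S').replace(tzinfo=tz)
    # Keep the fractional part as an integer to avoid float rounding
    nanos = int(frac.ljust(9, '0')[:9]) if frac else 0
    return int(dt.timestamp()) * 10**9 + nanos


def window(t_start, duration):
    """Return (t_start, t_end) in ns for a list of (value, unit) tuples."""
    t0 = to_unix_ns(t_start)
    spans = [value * NS_PER_UNIT[unit] for value, unit in duration]
    if len(spans) == 2:
        return t0 - spans[0], t0 + spans[1]
    # Assumption: a single element extends the window forward only
    return t0, t0 + spans[0]


print(window('2015-03-13 05:20:59.4910002', [(100, 's'), (100, 's')]))
# (1426220359491000200, 1426220559491000200)
```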
 

The pyeDSL also prohibits unsupported operations by raising a meaningful exception

In [64]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

QueryBuilder().with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')])
 
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-64-17466de21b88> in <module>()
      1 from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
      2 
----> 3 QueryBuilder().with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')])

/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/QueryBuilder.py in __getattr__(self, name)
    134     def __getattr__(self, name):
    135         if name not in ['query', 'with_pm', 'with_cals', 'with_nxcals']:
--> 136             raise AttributeError('{}() is not supported. {}'.format(name, self._msg))
    137 
    138     def __repr__(self):

AttributeError: with_duration() is not supported. Set database name using with_pm(), with_cals(ldb), with_nxcals(spark) method.
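
The traceback shows that QueryBuilder relies on __getattr__ to reject unsupported calls and on a stored message to hint at the next step. The following is a minimal sketch of that idea, not the lhcsmapi implementation: the Builder and DurationStage classes and their messages are hypothetical, and only one database and one time definition are modelled.

```python
class _Stage:
    """Common behaviour: hint in repr, meaningful error for unsupported calls."""

    _msg = ''

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails
        raise AttributeError('{}() is not supported. {}'.format(name, self._msg))

    def __repr__(self):
        return self._msg


class DurationStage(_Stage):
    _msg = 'Set time definition using with_duration() method.'

    def __init__(self, db):
        self.db = db

    def with_duration(self, t_start, t_end):
        return {'db': self.db, 't_start': t_start, 't_end': t_end}


class Builder(_Stage):
    _msg = 'Set database name using with_pm() method.'

    def with_pm(self):
        # Only the next stage exposes with_duration()
        return DurationStage('PM')


print(repr(Builder()))
try:
    Builder().with_duration(t_start=0, t_end=1)
except AttributeError as error:
    message = str(error)
print(message)
query = Builder().with_pm().with_duration(t_start=0, t_end=1)
print(query)
```

Because each stage is a separate object, a method becomes available only after the preceding step has been executed, which is what fixes the order of the sentence.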
 

Sections 1.1 - 1.6 deal with PM event and signal queries, CALS signal queries, and NXCALS signal and feature queries (as shown in Table 1). Each case is discussed both as a general-purpose query, where the user has to provide the full signal name and its metadata, and as a circuit-oriented query, which provides a generic way of querying LHC circuit variables. Both query types are polymorphic and complemented with a set of post-processing functions.

*Since CALS is about to be discontinued at CERN, we do not support CALS feature queries. For feature queries with pytimber, please consult https://gitlab.cern.ch/scripting-tools/pytimber

 

1.1. General-Purpose Query - Examples

A sentence constructed this way maintains the differences of query types while providing a common structure

 
  • PM event query
In [65]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

source_timestamp_df = QueryBuilder().with_pm() \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
    .with_query_parameters(system='FGC', className='51_self_pmd', source='RPTE.UA47.RB.A45') \
    .event_query().df

source_timestamp_df
Out[65]:
  source timestamp
0 RPTE.UA47.RB.A45 1426220469520000000
In [66]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

source_timestamp_df = QueryBuilder().with_pm() \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
    .with_query_parameters(system='QPS', className='DQAMCNMB_PMHSU', source='*') \
    .event_query().df

source_timestamp_df
Out[66]:
  source timestamp
0 B20L5 1426220469491000000
1 C20L5 1426220517100000000
2 A20L5 1426220518112000000
3 A21L5 1426220625990000000
4 B21L5 1426220866112000000
5 C23L4 1426236802332000000
6 B23L4 1426236839404000000
7 A23L4 1426236839832000000
8 C22L4 1426236949841000000
9 C15R4 1426251285711000000
10 B15R4 1426251337747000000
11 A15R4 1426251388741000000
12 B34L8 1426258716281000000
13 C34L8 1426258747672000000
14 A34L8 1426258747370000000
15 C33L8 1426258835955000000
16 C34R7 1426258853947000000
17 A34R7 1426258854113000000
18 A20R3 1426267931956000000
19 B20R3 1426267983579000000
20 C20R3 1426268004144000000
21 B18L5 1426277626360000000
22 A18L5 1426277679838000000
23 C18L5 1426277680496000000
24 A19L5 1426277903449000000
 
  • PM signal query
In [67]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_pm() \
    .with_timestamp(1426220469520000000) \
    .with_query_parameters(system='FGC', source='RPTE.UA47.RB.A45', className='51_self_pmd', signal='STATUS.I_MEAS') \
    .signal_query().dfs[0]

i_meas_df.plot()
Out[67]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f847dd1d978>
 
 
  • CALS signal query
In [68]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
import pytimber
ldb = pytimber.LoggingDB()

i_meas_df = QueryBuilder().with_cals(ldb) \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
    .with_query_parameters(signal='RPTE.UA47.RB.A45:I_MEAS') \
    .signal_query().dfs[0]

i_meas_df.plot()
 
WARNING:cmmnbuild_dep_manager:JVM is already started
 
 
 
 
Out[68]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f847dcbbeb8>
 
 
  • NXCALS signal query - device, property
In [69]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
    .with_query_parameters(nxcals_system='CMW', nxcals_device='RPTE.UA47.RB.A45', nxcals_property='SUB', signal='I_MEAS') \
    .signal_query().dfs[0]

i_meas_df.plot()
Out[69]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8474c97400>
 
 
  • NXCALS signal query - variable
In [70]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

u_mag_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
    .with_query_parameters(nxcals_system='WINCCOA', signal='DCBA.15R4.R:U_MAG') \
    .signal_query().dfs[0]

u_mag_df.plot()
Out[70]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f847c00a9b0>
 
In [71]:
bmode_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', t_end='2015-04-13 05:20:59.4910002') \
    .with_query_parameters(nxcals_system='CMW', signal='HX:BMODE') \
    .signal_query().dfs[0]

bmode_df.head()
Out[71]:
  HX:BMODE
nxcals_timestamp  
1426513864882000000 SETUP
1426515551191000000 NOBEAM
1426597426610000000 SETUP
1426597576404000000 STABLE
1426598591040000000 UNSTABLE
 
  • NXCALS feature query - device, property

I_MEAS from beam injection, through beam acceleration, to stable beams

In [72]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=1526898157236000000, t_end=1526903352338000000) \
    .with_query_parameters(nxcals_system='CMW', nxcals_device='RPTE.UA23.RB.A12', nxcals_property='SUB', signal='I_MEAS') \
    .feature_query(['mean', 'std', 'max', 'min', 'count']).df

i_meas_df
Out[72]:
  device mean std min count max
0 RPTE.UA23.RB.A12 5374.768659 3315.972856 757.18 2498 10978.8
 
  • NXCALS feature query - variable

U_MAG from beam injection, through beam acceleration, to stable beams

In [73]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

u_mag_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=1526898157236000000, t_end=1526903352338000000) \
    .with_query_parameters(nxcals_system='WINCCOA', signal='DCBB.8L2.R:U_MAG') \
    .feature_query(['mean', 'std', 'max', 'min', 'count']).df

u_mag_df
Out[73]:
  nxcals_variable_name mean std min count max
0 DCBB.8L2.R:U_MAG -0.196706 0.375195 -0.975332 51951 0.002113
 

1.2. General-Purpose Query - Polymorphism

  • multiple signal, multiple sources, multiple systems, multiple className
In [74]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_pm() \
    .with_timestamp(1426220469520000000) \
    .with_query_parameters(system=['FGC', 'FGC'], source=['RPTE.UA47.RB.A45', 'RPTE.UA47.RB.A45'], 
                           className=['51_self_pmd', '51_self_pmd'], signal=['STATUS.I_MEAS', 'STATUS.I_REF']) \
    .signal_query().dfs

ax=i_meas_df[0].plot()
i_meas_df[1].plot(ax=ax)
Out[74]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8474d73b00>
 
 
  • multiple signal, single source, single system, single className
In [75]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_pm() \
    .with_timestamp(1426220469520000000) \
    .with_query_parameters(system='FGC', source='RPTE.UA47.RB.A45', 
                           className='51_self_pmd', signal=['STATUS.I_MEAS', 'STATUS.I_REF']) \
    .signal_query().dfs

ax=i_meas_df[0].plot()
i_meas_df[1].plot(ax=ax)
Out[75]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8474d9a400>
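
The two cells above suggest that scalar parameters are repeated for every element of a list parameter. A minimal sketch of such broadcasting is given below; the broadcast helper is hypothetical and only illustrates the idea, it does not reproduce how QueryBuilder handles vector inputs internally.

```python
def broadcast(**params):
    """Expand scalar parameters so that every query gets one value per parameter."""
    lengths = {len(value) for value in params.values() if isinstance(value, list)}
    if len(lengths) > 1:
        raise ValueError('List parameters must have equal length, got {}'.format(sorted(lengths)))
    n = lengths.pop() if lengths else 1
    # Repeat scalars n times, keep lists as they are
    columns = {key: value if isinstance(value, list) else [value] * n
               for key, value in params.items()}
    # One dictionary per query
    return [dict(zip(columns, row)) for row in zip(*columns.values())]


queries = broadcast(system='FGC', source='RPTE.UA47.RB.A45',
                    className='51_self_pmd', signal=['STATUS.I_MEAS', 'STATUS.I_REF'])
for query in queries:
    print(query)
```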
 
 

1.3. Circuit-Oriented Query - Examples

  • PM event query
In [76]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

source_timestamp_df = QueryBuilder().with_pm() \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='PC') \
    .event_query().df

source_timestamp_df
Out[76]:
  source timestamp
0 RPTE.UA47.RB.A45 1426220469520000000
In [77]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

source_timestamp_df = QueryBuilder().with_pm() \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='QH', source='*') \
    .event_query().df

source_timestamp_df
Out[77]:
  source timestamp
0 B20L5 1426220469491000000
1 C20L5 1426220517100000000
2 A20L5 1426220518112000000
3 A21L5 1426220625990000000
4 B21L5 1426220866112000000
5 C23L4 1426236802332000000
6 B23L4 1426236839404000000
7 A23L4 1426236839832000000
8 C22L4 1426236949841000000
9 C15R4 1426251285711000000
10 B15R4 1426251337747000000
11 A15R4 1426251388741000000
12 B34L8 1426258716281000000
13 C34L8 1426258747672000000
14 A34L8 1426258747370000000
15 C33L8 1426258835955000000
16 C34R7 1426258853947000000
17 A34R7 1426258854113000000
18 A20R3 1426267931956000000
19 B20R3 1426267983579000000
20 C20R3 1426268004144000000
21 B18L5 1426277626360000000
22 A18L5 1426277679838000000
23 C18L5 1426277680496000000
24 A19L5 1426277903449000000
 
  • PM signal query
In [78]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_pm() \
    .with_timestamp(1426220469520000000) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='PC', signal='I_MEAS') \
    .signal_query().dfs[0]

i_meas_df.plot()
Out[78]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8474c56a58>
 
 
  • CALS signal query
In [79]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
import pytimber
ldb = pytimber.LoggingDB()

i_meas_df = QueryBuilder().with_cals(ldb) \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='PC', signal='I_MEAS')\
    .signal_query().dfs[0]

i_meas_df.plot()
 
WARNING:cmmnbuild_dep_manager:JVM is already started
 
 
 
 
Out[79]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8474b969e8>
 
 
  • NXCALS signal query - device, property
In [80]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='PC', signal='I_MEAS') \
    .signal_query().dfs[0]

i_meas_df.plot()
Out[80]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8474b497f0>
 
 
  • NXCALS signal query - variable
In [81]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

u_mag_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': 'DCBA.15R4.R'}) \
    .signal_query().dfs[0]

u_mag_df.plot()
Out[81]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8474b19a90>
 
 
  • NXCALS feature query - device, property

I_MEAS from beam injection, through beam acceleration, to stable beams

In [82]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=1526898157236000000, t_end=1526903352338000000) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system='PC', signal='I_MEAS') \
    .feature_query(['mean', 'std', 'max', 'min', 'count']).df

i_meas_df
Out[82]:
  device mean std min count max
0 RPTE.UA23.RB.A12 5374.768659 3315.972856 757.18 2498 10978.8
 
  • NXCALS feature query - variable

U_MAG from beam injection, through beam acceleration, to stable beams

In [83]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

u_mag_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=1526898157236000000, t_end=1526903352338000000) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': 'DCBB.8L2.R'}) \
    .feature_query(['mean', 'std', 'max', 'min', 'count']).df

u_mag_df
Out[83]:
  nxcals_variable_name mean std min count max
0 DCBB.8L2.R:U_MAG -0.196706 0.375195 -0.975332 51951 0.002113
 

1.4. Circuit-Oriented Query - Polymorphism

  • Multiple circuit names
In [84]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_dfs = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(100, 's'), (100, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name=['RB.A12', 'RB.A45'], system='PC', signal='I_MEAS')\
    .signal_query().dfs

ax = i_meas_dfs[0].plot()
i_meas_dfs[1].plot(ax=ax, grid=True)
Out[84]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8474aa5ba8>
 
 
  • Multiple system names
In [85]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

u_hts_dfs = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=1544622149598000000, duration=[(50, 's'), (150, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system=['LEADS_EVEN', 'LEADS_ODD'], signal='U_HTS') \
    .signal_query().dfs

ax = u_hts_dfs[0].plot()
u_hts_dfs[1].plot(ax=ax)
Out[85]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8474aa3748>
 
 
  • Multiple signal names
In [86]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

u_ext_dfs = QueryBuilder().with_pm() \
    .with_timestamp(1544622149598000000) \
    .with_circuit_type('RQ') \
    .with_metadata(circuit_name='RQD.A12', system='QDS', signal=['U_1_EXT', 'U_2_EXT'],
                        source='16L2', wildcard={'CELL': '16L2'}) \
    .signal_query().dfs

ax = u_ext_dfs[0].plot()
u_ext_dfs[1].plot(ax=ax)
Out[86]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8474be5ba8>
 
 
  • Signal wildcard
In [87]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
import matplotlib.pyplot as plt
import pytimber
ldb = pytimber.LoggingDB()

u_diode_rqd_dfs = QueryBuilder().with_cals(ldb) \
    .with_duration(t_start=int(1544622149613000000), duration=[(50, 's'), (150, 's')]) \
    .with_circuit_type('RQ') \
    .with_metadata(circuit_name='RQD.A12', system='DIODE_RQD', signal='U_DIODE_RQD', wildcard={'MAGNET': '*'})\
    .signal_query().dfs

# plot all
fig, ax = plt.subplots()
for u_diode_rqd_df in u_diode_rqd_dfs:
    u_diode_rqd_df.plot(ax=ax)
ax.legend().set_visible(False)
 
WARNING:cmmnbuild_dep_manager:JVM is already started
 
 
 
 
 
 

1.5. Advanced Feature Query

NXCALS can calculate signal features such as min, max, mean, std, and count directly on the cluster, without a costly query of the raw signal followed by a local calculation. To this end, a query should contain an element that enables a group-by operation; the groups can then be processed in parallel on the cluster. For the sake of compactness, we only show examples for circuit-oriented queries; the same principle applies to general-purpose queries.
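
To make the group-by idea concrete, the sketch below computes a few features per group in plain Python. It is only a local illustration of the principle: the features_by_group helper is hypothetical, the sample values are made up, and on the cluster the equivalent operation is carried out by Spark.

```python
from collections import defaultdict
from statistics import mean


def features_by_group(rows, key):
    """Group rows by key(row) and compute simple features of the 'value' field."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row['value'])
    # Each group is independent, so the groups could be processed in parallel
    return {name: {'mean': mean(values), 'min': min(values),
                   'max': max(values), 'count': len(values)}
            for name, values in groups.items()}


# Illustrative rows, not queried data
rows = [
    {'device': 'RPTE.UA23.RB.A12', 'value': 1.0},
    {'device': 'RPTE.UA23.RB.A12', 'value': 3.0},
    {'device': 'RPTE.UA47.RB.A45', 'value': 2.0},
    {'device': 'RPTE.UA47.RB.A45', 'value': 6.0},
]
features = features_by_group(rows, key=lambda row: row['device'])
print(features)
```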

  • Feature query of multiple signals for the same period of time
In [88]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=1526898157236000000, t_end=1526903352338000000) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='*', system='PC', signal='I_MEAS') \
    .feature_query(['mean', 'std', 'max', 'min', 'count']).df

i_meas_df
Out[88]:
  device mean std min count max
0 RPTE.UA63.RB.A56 5372.636369 3314.646638 756.90 2498 10974.36
1 RPTE.UA87.RB.A81 5371.596990 3314.230379 756.82 2498 10973.21
2 RPTE.UA83.RB.A78 5369.226934 3312.676833 756.47 2498 10967.94
3 RPTE.UA27.RB.A23 5373.973151 3315.324716 757.04 2498 10976.46
4 RPTE.UA43.RB.A34 5373.559131 3314.074552 756.95 2497 10975.16
5 RPTE.UA67.RB.A67 5374.329179 3315.855302 757.18 2498 10978.56
6 RPTE.UA47.RB.A45 5370.954557 3312.837184 756.71 2497 10971.44
7 RPTE.UA23.RB.A12 5374.768659 3315.972856 757.18 2498 10978.80
 
  • Feature query of multiple signals over a period of time subdivided into three intervals - group by signal name and interval
In [89]:
t_start_injs = 1526898157236000000
t_end_injs = 1526899957236000000
t_start_sbs = 1526901552338000000
t_end_sbs = 1526903352338000000
In [90]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system='PC', signal='I_MEAS')\
    .signal_query().dfs[0]

ax = i_meas_df.plot(figsize=(10,5), linewidth=5)
ax.axvspan(xmin=t_start_injs, xmax=t_end_injs, facecolor='xkcd:goldenrod')
ax.axvspan(xmin=t_end_injs, xmax=t_start_sbs, facecolor='xkcd:grey')
ax.axvspan(xmin=t_start_sbs, xmax=t_end_sbs, facecolor='xkcd:green')
Out[90]:
<matplotlib.patches.Polygon at 0x7f846ff09e10>
 
 

The function translate maps each value of the time column to an interval label. Here, we consider three subintervals: beam injection, beam acceleration, and stable beams.

As a result, the time column forms a partition, and the features of each interval can be computed in parallel.

In [91]:
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

def translate(timestamp):
    if(timestamp >= t_start_injs and timestamp < t_end_injs):
        return 1
    
    if(timestamp >=  t_end_injs and timestamp < t_start_sbs):
        return 2
    
    if(timestamp >= t_start_sbs and timestamp <= t_end_sbs):
        return 3
    
    return -1

translate_udf = udf(translate, IntegerType())
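
The mapping can be checked in plain Python before it is wrapped in a UDF. The sketch below restates translate with chained comparisons and probes the interval boundaries; the timestamps are the ones defined in In [89], and no Spark session is needed.

```python
t_start_injs = 1526898157236000000
t_end_injs = 1526899957236000000
t_start_sbs = 1526901552338000000
t_end_sbs = 1526903352338000000


def translate(timestamp):
    if t_start_injs <= timestamp < t_end_injs:
        return 1  # beam injection
    if t_end_injs <= timestamp < t_start_sbs:
        return 2  # beam acceleration
    if t_start_sbs <= timestamp <= t_end_sbs:
        return 3  # stable beams
    return -1  # outside of the queried period


probes = (t_start_injs - 1, t_start_injs, t_end_injs, t_start_sbs, t_end_sbs, t_end_sbs + 1)
labels = [translate(t) for t in probes]
print(labels)  # [-1, 1, 2, 3, 3, -1]
```

The first two intervals are half-open and the last one is closed, so every timestamp of the queried period receives exactly one label.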
 

The wrapped function translate_udf is passed to feature_query() through the function argument

In [92]:
i_meas_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='*', system='PC', signal='I_MEAS') \
    .feature_query(['mean', 'std', 'max', 'min', 'count'], function=translate_udf).df

i_meas_df
Out[92]:
  device class mean std min count max
0 RPTE.UA43.RB.A34 1 756.959833 0.001291 756.95 60 756.96
1 RPTE.UA63.RB.A56 3 10974.350167 0.001291 10974.35 60 10974.36
2 RPTE.UA47.RB.A45 3 10971.439667 0.001810 10971.43 60 10971.44
3 RPTE.UA47.RB.A45 2 5346.059971 3193.562789 756.71 2377 10970.89
4 RPTE.UA23.RB.A12 3 10978.781500 0.007552 10978.77 60 10978.80
5 RPTE.UA83.RB.A78 2 5344.349941 3193.572017 756.47 2378 10967.54
6 RPTE.UA43.RB.A34 2 5348.695949 3194.776339 756.95 2377 10974.68
7 RPTE.UA67.RB.A67 1 757.188833 0.003237 757.18 60 757.19
8 RPTE.UA87.RB.A81 3 10973.209167 0.002787 10973.20 60 10973.21
9 RPTE.UA27.RB.A23 2 5349.106434 3196.138736 757.04 2378 10976.10
10 RPTE.UA83.RB.A78 3 10967.940000 0.000000 10967.94 60 10967.94
11 RPTE.UA47.RB.A45 1 756.710000 0.000000 756.71 60 756.71
12 RPTE.UA83.RB.A78 1 756.472000 0.004034 756.47 60 756.48
13 RPTE.UA27.RB.A23 1 757.046833 0.004691 757.04 60 757.05
14 RPTE.UA87.RB.A81 2 5346.697637 3195.063002 756.83 2378 10972.79
15 RPTE.UA87.RB.A81 1 756.829167 0.002787 756.82 60 756.83
16 RPTE.UA23.RB.A12 2 5349.879361 3196.753501 757.19 2378 10978.41
17 RPTE.UA67.RB.A67 2 5349.423865 3196.631627 757.18 2378 10978.15
18 RPTE.UA63.RB.A56 2 5347.758797 3195.477077 756.90 2378 10973.97
19 RPTE.UA43.RB.A34 3 10975.154833 0.005039 10975.15 60 10975.16
20 RPTE.UA23.RB.A12 1 757.201667 0.007847 757.18 60 757.22
21 RPTE.UA27.RB.A23 3 10976.450333 0.001810 10976.45 60 10976.46
22 RPTE.UA63.RB.A56 1 756.903667 0.004860 756.90 60 756.91
23 RPTE.UA67.RB.A67 3 10978.550167 0.001291 10978.55 60 10978.56
 

The same method can be applied to NXCALS signals queried by variable name

In [93]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

u_mag_ab_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': 'DCBB.8L2.R'}) \
    .feature_query(['mean', 'std', 'max', 'min', 'count'], function=translate_udf).df

u_mag_ab_df
Out[93]:
  nxcals_variable_name class mean std min count max
0 DCBB.8L2.R:U_MAG 3 -0.001983 0.002882 -0.034035 18000 0.002003
1 DCBB.8L2.R:U_MAG 1 -0.001861 0.002844 -0.005992 18000 0.002113
2 DCBB.8L2.R:U_MAG 2 -0.636315 0.423767 -0.975332 15951 0.002041
 

This method can be used together with signal decimation, i.e., taking every n-th sample.

For example, this is useful for querying QPS boards A and B, which share the same channel with samples shifted by 5, so that

  • every 10-th sample belongs to board A (or B), decimation=10
In [94]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

u_mag_a_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': 'DCBB.8L2.R'}) \
    .feature_query(['mean', 'std', 'max', 'min', 'count'], function=translate_udf, decimation=10).df

u_mag_a_df
Out[94]:
  nxcals_variable_name class mean std min count max
0 DCBB.8L2.R:U_MAG 3 0.000812 0.000529 -0.012705 1800 0.002003
1 DCBB.8L2.R:U_MAG 1 0.000965 0.000434 -0.000463 1800 0.002113
2 DCBB.8L2.R:U_MAG 2 -0.633601 0.423847 -0.969405 1595 0.002041
 
  • every 10-th sample, starting from the 5-th one, belongs to board B (or A), decimation=10, shift=5
In [95]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

u_mag_b_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': 'DCBB.8L2.R'}) \
    .feature_query(['mean', 'std', 'max', 'min', 'count'], function=translate_udf, decimation=10, shift=5).df

u_mag_b_df
Out[95]:
  nxcals_variable_name class mean std min count max
0 DCBB.8L2.R:U_MAG 3 -0.004754 0.000877 -0.034035 1800 -0.003425
1 DCBB.8L2.R:U_MAG 1 -0.004656 0.000441 -0.005992 1800 -0.003361
2 DCBB.8L2.R:U_MAG 2 -0.639233 0.423941 -0.975332 1595 -0.003052
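
The effect of decimation and shift can be illustrated on a plain list. This is a sketch under the assumption that samples are counted from index 0; whether feature_query() counts from 0 or 1 is not stated here, so the exact rows it selects may differ. The decimate helper is hypothetical.

```python
def decimate(samples, decimation, shift=0):
    """Keep every decimation-th sample, starting at index shift (0-based, an assumption)."""
    return samples[shift::decimation]


samples = list(range(30))  # stand-in for sample indices
first_board = decimate(samples, decimation=10)
second_board = decimate(samples, decimation=10, shift=5)
print(first_board)   # [0, 10, 20]
print(second_board)  # [5, 15, 25]
```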
 
  • with polymorphism one can query 1248 busbars at once (in two batches of 624 due to the limit of 1000 signals per query)
In [96]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

u_mag_ab_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=t_start_injs, t_end=t_end_sbs) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='*', system='BUSBAR', signal='U_MAG', wildcard={'BUSBAR': '*'}) \
    .feature_query(['mean', 'std', 'max', 'min', 'count'], function=translate_udf).df

u_mag_ab_df.head()
Out[96]:
  nxcals_variable_name class mean std min count max
0 DCBA.B34R3.L:U_MAG 2 0.634813 0.423418 0.000184 15951 0.970042
1 DCBA.22L4.L:U_MAG 1 -0.001110 0.000639 -0.002102 18000 -0.000167
2 DCBB.11L2.R:U_MAG 3 0.001457 0.002587 -0.002121 18000 0.048651
3 DCBA.B33L5.L:U_MAG 2 0.634651 0.423401 -0.001109 15951 0.971211
4 DCBA.B13L5.L:U_MAG 2 0.632462 0.423240 -0.002246 15951 0.968428
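
The batch sizes quoted above follow from splitting the signal list into the smallest number of equally sized batches that respect the limit. A minimal sketch of this arithmetic is given below; the split_into_batches helper and the placeholder names are hypothetical, and the library may split the list differently.

```python
import math


def split_into_batches(names, limit):
    """Split names into the fewest equally sized batches with at most limit items each."""
    if not names:
        return []
    n_batches = math.ceil(len(names) / limit)
    size = math.ceil(len(names) / n_batches)
    return [names[i:i + size] for i in range(0, len(names), size)]


names = ['BUSBAR_{}'.format(i) for i in range(1248)]  # placeholder names
batches = split_into_batches(names, limit=1000)
print([len(batch) for batch in batches])  # [624, 624]
```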
 

1.6. Processing Raw Signals

Once signals are queried, one can perform operations on each of them.
In this case, the order of operations does not matter (but can be checked).

Signal query                                         Signal processing
{DB}.{DURATION}.{QUERY_PARAMETERS}.{QUERY}           .synchronize_time()
{DB}.{DURATION}.{CIRCUIT_TYPE}.{METADATA}.{QUERY}    .convert_index_to_sec()
                                                     .create_col_from_index()
                                                     .filter_median()
                                                     .remove_values_for_time_less_than()
                                                     .remove_initial_offset()
In [97]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

i_meas_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start='2015-01-13 16:59:11+01:00', t_end='2015-01-13 17:15:46+01:00') \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system='PC', signal='I_MEAS') \
    .signal_query() \
    .synchronize_time() \
    .convert_index_to_sec().dfs[0]

i_meas_df.plot()
Out[97]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f847413d908>
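
The two processing steps used above can be sketched on a plain list of (timestamp, value) pairs. This is an illustration under assumptions: synchronize_time is taken to shift the index so that the first sample is at zero, and convert_index_to_sec to convert nanoseconds to seconds. The library methods may behave differently, and the sample values are made up.

```python
def synchronize_time(samples):
    """Shift timestamps so that the first sample is at t = 0 (assumed behaviour)."""
    t0 = samples[0][0]
    return [(t - t0, value) for t, value in samples]


def convert_index_to_sec(samples):
    """Convert nanosecond timestamps to seconds (assumed behaviour)."""
    return [(t / 1e9, value) for t, value in samples]


# Illustrative samples with timestamps in nanoseconds
raw = [
    (1421164751000000000, 0.0),
    (1421164752000000000, 1.5),
    (1421164753500000000, 3.0),
]
processed = convert_index_to_sec(synchronize_time(raw))
print(processed)  # [(0.0, 0.0), (1.0, 1.5), (2.5, 3.0)]
```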
 
 

1.7. Processing Raw Events

For PM event queries, one can perform several operations on the (source, timestamp) dataframe.

Event query                                          Event processing
{DB}.{DURATION}.{QUERY_PARAMETERS}.{QUERY}           .filter_source()
{DB}.{DURATION}.{CIRCUIT_TYPE}.{METADATA}.{QUERY}    .drop_duplicate_source()
                                                     .sort_values()

The processing methods are dedicated to performing repeated operations on PM events. When searching a given system and className with the wildcard '*' as a source, the event query can return events from different sectors; in this case, one can filter the events to keep only those of a given sector. Some PM systems return duplicate events from different types of boards; in this case, one can drop duplicate sources. Finally, the events can be sorted by either source or timestamp.
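
Dropping duplicate sources and sorting can be sketched on a plain list of (source, timestamp) pairs. The helpers below are hypothetical stand-ins that keep the first event of each source and sort by timestamp; the behaviour of the library methods (for example, which duplicate is kept) is an assumption. The sample rows follow the style of the QDS events listed in this section.

```python
def drop_duplicate_source(events):
    """Keep the first event of each source (assumed behaviour)."""
    seen = set()
    unique = []
    for source, timestamp in events:
        if source not in seen:
            seen.add(source)
            unique.append((source, timestamp))
    return unique


def sort_by_timestamp(events):
    """Sort events by their timestamp in ascending order."""
    return sorted(events, key=lambda event: event[1])


events = [
    ('C20L5', 1426220517099000000),
    ('B20L5', 1426220469490000000),
    ('B20L5', 1426220469492000000),
    ('C20L5', 1426220517101000000),
]
result = sort_by_timestamp(drop_duplicate_source(events))
print(result)
# [('B20L5', 1426220469490000000), ('C20L5', 1426220517099000000)]
```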

 
  • Filter source
In [98]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

source_timestamp_df = QueryBuilder().with_pm() \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='QH', source='*') \
    .event_query() \
    .df

source_timestamp_df
Out[98]:
  source timestamp
0 B20L5 1426220469491000000
1 C20L5 1426220517100000000
2 A20L5 1426220518112000000
3 A21L5 1426220625990000000
4 B21L5 1426220866112000000
5 C23L4 1426236802332000000
6 B23L4 1426236839404000000
7 A23L4 1426236839832000000
8 C22L4 1426236949841000000
9 C15R4 1426251285711000000
10 B15R4 1426251337747000000
11 A15R4 1426251388741000000
12 B34L8 1426258716281000000
13 C34L8 1426258747672000000
14 A34L8 1426258747370000000
15 C33L8 1426258835955000000
16 C34R7 1426258853947000000
17 A34R7 1426258854113000000
18 A20R3 1426267931956000000
19 B20R3 1426267983579000000
20 C20R3 1426268004144000000
21 B18L5 1426277626360000000
22 A18L5 1426277679838000000
23 C18L5 1426277680496000000
24 A19L5 1426277903449000000
 

Executing filter_source() with the circuit name and system type filters out the events that do not belong to the given circuit

In [99]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

source_timestamp_df = QueryBuilder().with_pm() \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='QH', source='*') \
    .event_query() \
    .filter_source('RB.A45', 'QH') \
    .df

source_timestamp_df
Out[99]:
  source timestamp
0 B20L5 1426220469491000000
1 C20L5 1426220517100000000
2 A20L5 1426220518112000000
3 A21L5 1426220625990000000
4 B21L5 1426220866112000000
5 C15R4 1426251285711000000
6 B15R4 1426251337747000000
7 A15R4 1426251388741000000
8 B18L5 1426277626360000000
9 A18L5 1426277679838000000
10 C18L5 1426277680496000000
11 A19L5 1426277903449000000
 
  • Drop duplicates
In [100]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

source_timestamp_df = QueryBuilder().with_pm() \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='QDS', source='*') \
    .event_query() \
    .df

source_timestamp_df
Out[100]:
  source timestamp
0 B20L5 1426220469490000000
1 C20L5 1426220517099000000
2 A20L5 1426220518111000000
3 A21L5 1426220625989000000
4 B21L5 1426220866111000000
5 B20L5 1426220469492000000
6 C20L5 1426220517101000000
7 A20L5 1426220518113000000
8 A21L5 1426220625991000000
9 B21L5 1426220866113000000
10 C23L4 1426236802331000000
11 B23L4 1426236839403000000
12 A23L4 1426236839831000000
13 C22L4 1426236949840000000
14 C23L4 1426236802333000000
15 B23L4 1426236839405000000
16 A23L4 1426236839833000000
17 C22L4 1426236949842000000
18 C15R4 1426251285710000000
19 B15R4 1426251337746000000
20 A15R4 1426251388740000000
21 C15R4 1426251285712000000
22 B15R4 1426251337748000000
23 A15R4 1426251388742000000
24 B34L8 1426258716280000000
25 C34L8 1426258747671000000
26 A34L8 1426258747369000000
27 C33L8 1426258835954000000
28 C34R7 1426258853946000000
29 A34R7 1426258854112000000
30 B34L8 1426258716282000000
31 C34L8 1426258747673000000
32 A34L8 1426258747371000000
33 C33L8 1426258835956000000
34 C34R7 1426258853948000000
35 A34R7 1426258854114000000
36 A20R3 1426267931955000000
37 B20R3 1426267983578000000
38 C20R3 1426268004143000000
39 A20R3 1426267931957000000
40 B20R3 1426267983580000000
41 C20R3 1426268004145000000
42 B18L5 1426277626359000000
43 A18L5 1426277679837000000
44 C18L5 1426277680495000000
45 A19L5 1426277903448000000
46 B18L5 1426277626361000000
47 A18L5 1426277679839000000
48 C18L5 1426277680497000000
49 A19L5 1426277903450000000
 

As before, calling filter_source() with the circuit name and the system type removes the events that do not belong to the given circuit.

In [101]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

source_timestamp_df = QueryBuilder().with_pm() \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='QDS', source='*') \
    .event_query() \
    .filter_source('RB.A45', 'QDS') \
    .df

source_timestamp_df
Out[101]:
  source timestamp
0 B20L5 1426220469490000000
1 C20L5 1426220517099000000
2 A20L5 1426220518111000000
3 A21L5 1426220625989000000
4 B21L5 1426220866111000000
5 B20L5 1426220469492000000
6 C20L5 1426220517101000000
7 A20L5 1426220518113000000
8 A21L5 1426220625991000000
9 B21L5 1426220866113000000
10 C15R4 1426251285710000000
11 B15R4 1426251337746000000
12 A15R4 1426251388740000000
13 C15R4 1426251285712000000
14 B15R4 1426251337748000000
15 A15R4 1426251388742000000
16 B18L5 1426277626359000000
17 A18L5 1426277679837000000
18 C18L5 1426277680495000000
19 A19L5 1426277903448000000
20 B18L5 1426277626361000000
21 A18L5 1426277679839000000
22 C18L5 1426277680497000000
23 A19L5 1426277903450000000
 
  • drop_duplicate_source()

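Some PM systems return the same event more than once because it is recorded by different types of boards. In the QDS output above, each source appears twice with timestamps 2 ms apart. Calling drop_duplicate_source() removes these duplicates.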

In [102]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

source_timestamp_df = QueryBuilder().with_pm() \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='QDS', source='*') \
    .event_query() \
    .filter_source('RB.A45', 'QDS') \
    .drop_duplicate_source() \
    .df

source_timestamp_df
Out[102]:
  source timestamp
0 B20L5 1426220469490000000
1 C20L5 1426220517099000000
2 A20L5 1426220518111000000
3 A21L5 1426220625989000000
4 B21L5 1426220866111000000
5 C15R4 1426251285710000000
6 B15R4 1426251337746000000
7 A15R4 1426251388740000000
8 B18L5 1426277626359000000
9 A18L5 1426277679837000000
10 C18L5 1426277680495000000
11 A19L5 1426277903448000000
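
The effect seen above (one remaining row per source, with the earlier timestamp kept) can be reproduced with plain pandas. This is only a sketch under the assumption that the method behaves like DataFrame.drop_duplicates on the source column; the actual implementation of drop_duplicate_source() is not shown in this document and may differ.

```python
import pandas as pd

# Events copied from the filtered QDS output above: each source appears twice,
# with timestamps 2 ms apart
events_df = pd.DataFrame({
    'source': ['B20L5', 'C20L5', 'B20L5', 'C20L5'],
    'timestamp': [1426220469490000000, 1426220517099000000,
                  1426220469492000000, 1426220517101000000],
})

# Assumption: keep the first event reported for each source
deduplicated_df = (events_df
                   .drop_duplicates(subset='source', keep='first')
                   .reset_index(drop=True))
print(deduplicated_df)
```

Note that deduplicating on the source name alone would also discard a genuine second event from the same source within the queried period, so this simplification only suits queries where each source is expected to trigger once.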
 
  • sort_values()

The events can be sorted by either source or timestamp.

In [103]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder

source_timestamp_df = QueryBuilder().with_pm() \
    .with_duration(t_start='2015-03-13 05:20:59.4910002', duration=[(24*60*60, 's')]) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A45', system='QDS', source='*') \
    .event_query() \
    .filter_source('RB.A45', 'QDS') \
    .drop_duplicate_source() \
    .sort_values(by='timestamp') \
    .df

source_timestamp_df
Out[103]:
  source timestamp
0 B20L5 1426220469490000000
1 C20L5 1426220517099000000
2 A20L5 1426220518111000000
3 A21L5 1426220625989000000
4 B21L5 1426220866111000000
5 C15R4 1426251285710000000
6 B15R4 1426251337746000000
7 A15R4 1426251388740000000
8 B18L5 1426277626359000000
9 A18L5 1426277679837000000
10 C18L5 1426277680495000000
11 A19L5 1426277903448000000
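
For comparison, the same two orderings expressed directly with pandas. It is an assumption that the eDSL method forwards its by argument to DataFrame.sort_values; the sketch only illustrates the two sort keys mentioned above.

```python
import pandas as pd

# First three events of the output above
events_df = pd.DataFrame({
    'source': ['B20L5', 'C20L5', 'A20L5'],
    'timestamp': [1426220469490000000, 1426220517099000000, 1426220518111000000],
})

# Chronological order (already the case for these rows)
by_timestamp_df = events_df.sort_values(by='timestamp').reset_index(drop=True)

# Alphabetical order of the source names
by_source_df = events_df.sort_values(by='source').reset_index(drop=True)

print(by_timestamp_df)
print(by_source_df)
```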
 

2. AssertionBuilder()

 

{SIGNALS}.(TIME_RANGE).{ASSERTION}

 

Signal input      Time range definition (optional) / Signal assertion      Signal assertion (if time range defined)
.with_signal()
                  .has_min_max_value()
                  .compare_to_reference()
                  .with_time_range()                                       .has_min_max_variation()
                  .with_time_range()                                       .has_min_max_slope()
 
  • has_min_max_value()
In [104]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
from lhcsmapi.pyedsl.AssertionBuilder import AssertionBuilder

tt891a_dfs = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start='2014-12-13 09:12:41+01:00', t_end='2014-12-13 12:27:11+01:00') \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system=['LEADS_EVEN', 'LEADS_ODD'], signal='TT891A') \
    .signal_query() \
    .synchronize_time() \
    .convert_index_to_sec() \
    .filter_median().dfs


AssertionBuilder().with_signal(tt891a_dfs) \
    .has_min_max_value(value_min=46, value_max=54)
Out[104]:
<lhcsmapi.pyedsl.AssertionBuilder.AssertionBuilderSignalPlot at 0x7f8474826c18>
 
 

In the case below, the lower limit is raised from 46 to 50, so the assertion fails and raises a warning for each signal that leaves the allowed range.

In [105]:
AssertionBuilder().with_signal(tt891a_dfs) \
    .has_min_max_value(value_min=50, value_max=54)
 
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:114: UserWarning: DACA05_07L2_TT891A.TEMPERATURECALC outside of the [50, 54] mV threshold
  warnings.warn('{} outside of the [{}, {}] mV threshold'.format(col, value_min, value_max))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:114: UserWarning: DACA06_07L2_TT891A.TEMPERATURECALC outside of the [50, 54] mV threshold
  warnings.warn('{} outside of the [{}, {}] mV threshold'.format(col, value_min, value_max))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:114: UserWarning: DABA01_07R1_TT891A.TEMPERATURECALC outside of the [50, 54] mV threshold
  warnings.warn('{} outside of the [{}, {}] mV threshold'.format(col, value_min, value_max))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:114: UserWarning: DABA02_07R1_TT891A.TEMPERATURECALC outside of the [50, 54] mV threshold
  warnings.warn('{} outside of the [{}, {}] mV threshold'.format(col, value_min, value_max))
Out[105]:
<lhcsmapi.pyedsl.AssertionBuilder.AssertionBuilderSignalPlot at 0x7f84748a6c88>
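
The kind of check behind these warnings can be sketched in a few lines of pandas. The function check_min_max_value and the sample values are hypothetical and only mirror the behaviour visible in the warnings above (one warning per column that leaves the range); they are not the AssertionBuilder code.

```python
import warnings

import pandas as pd


def check_min_max_value(df, value_min, value_max):
    """Warn for every column with a sample outside [value_min, value_max].

    Returns the list of offending column names.
    """
    failed = []
    for col in df.columns:
        outside = (df[col] < value_min) | (df[col] > value_max)
        if outside.any():
            warnings.warn('{} outside of the [{}, {}] threshold'
                          .format(col, value_min, value_max))
            failed.append(col)
    return failed


# Synthetic data for illustration (not measured values)
signal_df = pd.DataFrame({'TT891A_1': [48.0, 49.5, 51.0],
                          'TT891A_2': [50.5, 51.0, 52.0]})

# Record the warnings so that they can be inspected afterwards
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    failed_columns = check_min_max_value(signal_df, value_min=50, value_max=54)

print(failed_columns)
```

Issuing a warning rather than raising an exception lets the remaining signals be checked in the same call, which matches the four warnings reported above for a single assertion.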
 
 
  • compare_to_reference()
In [106]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
from lhcsmapi.pyedsl.AssertionBuilder import AssertionBuilder
from lhcsmapi.reference.Reference import Reference
from lhcsmapi.Time import Time

timestamp_ee_rqd = 1544622149701000000
timestamp_fgc_rqd = 1544622149620000000
signal_names = 'T_RES'


t_res_df = QueryBuilder().with_pm() \
    .with_timestamp(timestamp_ee_rqd) \
    .with_circuit_type('RQ') \
    .with_metadata(circuit_name='RQD.A12', system='EE', signal=signal_names).signal_query() \
    .remove_values_for_time_less_than(timestamp_ee_rqd) \
    .synchronize_time(timestamp_fgc_rqd) \
    .convert_index_to_sec().dfs[0]


timestamp_ee_ref_rqd = Reference.get_power_converter_reference_fpa('RQ', 'RQD.A12', 'eePm')
timestamp_ee_ref_rqd = Time.to_unix_timestamp(timestamp_ee_ref_rqd)

timestamp_fgc_ref_rqd = Reference.get_power_converter_reference_fpa('RQ', 'RQD.A12', 'fgcPm')
timestamp_fgc_ref_rqd = Time.to_unix_timestamp(timestamp_fgc_ref_rqd)


t_res_ref_df = QueryBuilder().with_pm() \
    .with_timestamp(timestamp_ee_ref_rqd) \
    .with_circuit_type('RQ') \
    .with_metadata(circuit_name='RQD.A12', system='EE', signal=signal_names).signal_query() \
    .remove_values_for_time_less_than(timestamp_ee_ref_rqd) \
    .synchronize_time(timestamp_fgc_ref_rqd) \
    .convert_index_to_sec().dfs[0]

AssertionBuilder().with_signal([t_res_df])\
    .compare_to_reference(signal_ref_dfs=[t_res_ref_df], abs_margin=25, scaling=1)
Out[106]:
<lhcsmapi.pyedsl.AssertionBuilder.AssertionBuilderSignalPlot at 0x7f8474314828>
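
One plausible reading of abs_margin and scaling is that the signal must stay within abs_margin of the scaled reference at every sample. The sketch below implements that reading with NumPy; the function within_reference, its semantics, and the synthetic curves are assumptions made for illustration, since the document does not define how compare_to_reference() combines the two parameters.

```python
import numpy as np
import pandas as pd


def within_reference(signal, reference, abs_margin, scaling=1.0):
    """True if |signal - scaling * reference| <= abs_margin at every sample.

    Hypothetical semantics; the reference is linearly interpolated onto
    the time index of the signal before the comparison.
    """
    ref_on_signal = np.interp(signal.index.values,
                              reference.index.values,
                              reference.values)
    deviation = np.abs(signal.values - scaling * ref_on_signal)
    return bool(np.all(deviation <= abs_margin))


# Synthetic decay curve used as reference, and a signal offset by 10 units
time_s = np.linspace(0.0, 10.0, 11)
reference = pd.Series(100.0 * np.exp(-time_s / 5.0), index=time_s)
signal = reference + 10.0

ok_wide = within_reference(signal, reference, abs_margin=25)
ok_tight = within_reference(signal, reference, abs_margin=5)
print(ok_wide, ok_tight)
```

Interpolating onto a common time axis is needed because the signal and its reference are synchronized to different timestamps and generally do not share sample times.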
 
 
  • has_min_max_variation()
In [111]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
from lhcsmapi.pyedsl.AssertionBuilder import AssertionBuilder
from lhcsmapi.analysis.CircuitAnalysis import get_current_plateau_start_end

t_start = '2014-12-13 09:12:41+01:00'
t_end = '2014-12-13 12:27:11+01:00'

cv891_dfs = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=t_start, t_end=t_end) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system=['LEADS_EVEN', 'LEADS_ODD'], signal='CV891') \
    .signal_query() \
    .synchronize_time() \
    .convert_index_to_sec() \
    .filter_median() \
    .dfs

i_meas_raw_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=t_start, t_end=t_end) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system='PC', signal='I_MEAS') \
    .signal_query() \
    .dfs[0]

plateau_timing_df = get_current_plateau_start_end(i_meas_raw_df, i_meas_threshold=500)

AssertionBuilder().with_signal(cv891_dfs) \
    .with_time_range(t_start=plateau_timing_df['plateau_start_sync'], t_end=plateau_timing_df['plateau_end_sync']) \
    .has_min_max_variation(variation_min_max=8)
Out[111]:
<lhcsmapi.pyedsl.AssertionBuilder.AssertionBuilderSignalPlot at 0x7f846fd924a8>
 
 

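In the case below, the allowed variation (variation_min_max=1) is too tight, so the assertion fails and a warning is raised for each signal and plateau that exceeds it. Note that in the warning text the value in parentheses is the threshold and the second percentage is the measured variation.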

In [112]:
AssertionBuilder().with_signal(cv891_dfs) \
    .with_time_range(t_start=plateau_timing_df['plateau_start_sync'], t_end=plateau_timing_df['plateau_end_sync']) \
    .has_min_max_variation(variation_min_max=1)
 
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:219: UserWarning: The variation of DACA05_07L2_CV891.POSST (1.0 %) exceeds 1.5 % for constant current from 806.0 to 5882.5 s
  .format(col, variation_min_max, variation.values[0], t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:219: UserWarning: The variation of DACA06_07L2_CV891.POSST (1.0 %) exceeds 1.9000000000000021 % for constant current from 806.0 to 5882.5 s
  .format(col, variation_min_max, variation.values[0], t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:219: UserWarning: The variation of DABA01_07R1_CV891.POSST (1.0 %) exceeds 1.6999999999999993 % for constant current from 806.0 to 5882.5 s
  .format(col, variation_min_max, variation.values[0], t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:219: UserWarning: The variation of DABA01_07R1_CV891.POSST (1.0 %) exceeds 1.1000000000000014 % for constant current from 6199.0 to 9745.0 s
  .format(col, variation_min_max, variation.values[0], t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:219: UserWarning: The variation of DABA02_07R1_CV891.POSST (1.0 %) exceeds 2.099999999999998 % for constant current from 806.0 to 5882.5 s
  .format(col, variation_min_max, variation.values[0], t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:219: UserWarning: The variation of DABA02_07R1_CV891.POSST (1.0 %) exceeds 1.2000000000000028 % for constant current from 6199.0 to 9745.0 s
  .format(col, variation_min_max, variation.values[0], t_s, t_e))
Out[112]:
<lhcsmapi.pyedsl.AssertionBuilder.AssertionBuilderSignalPlot at 0x7f84742bbdd8>
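
A variation check of this kind can be sketched as the peak-to-peak value of the signal inside a time window. The function variation_in_range and the sample data are hypothetical; the sketch handles a single window, whereas the call above passes one start and end time per current plateau.

```python
import pandas as pd


def variation_in_range(df, t_start, t_end):
    """Peak-to-peak variation (max - min) of each column for t_start <= index <= t_end."""
    window = df.loc[t_start:t_end]
    return window.max() - window.min()


# Synthetic valve opening in percent, indexed by time in seconds (illustration only)
valve_df = pd.DataFrame({'CV891_1': [20.0, 20.5, 21.5, 30.0]},
                        index=[0.0, 100.0, 200.0, 300.0])

variation = variation_in_range(valve_df, t_start=0.0, t_end=200.0)
exceeds = variation > 1.0  # threshold of 1 %, as in the failing case above
print(variation)
print(exceeds)
```

The last sample (30.0 at 300 s) lies outside the window and therefore does not contribute, which is the reason for restricting the assertion to the plateau.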
 
 
  • has_min_max_slope()
In [114]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
from lhcsmapi.pyedsl.AssertionBuilder import AssertionBuilder
from lhcsmapi.analysis.CircuitAnalysis import get_current_plateau_start_end

t_start = '2014-12-13 09:12:41+01:00'
t_end = '2014-12-13 12:27:11+01:00'

u_res_dfs = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=t_start, t_end=t_end) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system=['LEADS_EVEN', 'LEADS_ODD'], signal='U_RES') \
    .signal_query() \
    .synchronize_time() \
    .convert_index_to_sec() \
    .filter_median() \
    .dfs

i_meas_raw_df = QueryBuilder().with_nxcals(spark) \
    .with_duration(t_start=t_start, t_end=t_end) \
    .with_circuit_type('RB') \
    .with_metadata(circuit_name='RB.A12', system='PC', signal='I_MEAS') \
    .signal_query() \
    .dfs[0]

plateau_timing_df = get_current_plateau_start_end(i_meas_raw_df, i_meas_threshold=500)

AssertionBuilder().with_signal(u_res_dfs) \
    .with_time_range(t_start=plateau_timing_df['plateau_start_sync'], t_end=plateau_timing_df['plateau_end_sync']) \
    .has_min_max_slope(slope_min=-2, slope_max=2)
Out[114]:
<lhcsmapi.pyedsl.AssertionBuilder.AssertionBuilderSignalPlot at 0x7f84745c6668>
 
 

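In the case below, the slope limits (slope_min=-2e-3, slope_max=2e-3) are too tight, so the assertion fails and a warning reports the drift of each signal on each plateau.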

In [116]:
AssertionBuilder().with_signal(u_res_dfs) \
    .with_time_range(t_start=plateau_timing_df['plateau_start_sync'], t_end=plateau_timing_df['plateau_end_sync']) \
    .has_min_max_slope(slope_min=-2e-3, slope_max=2e-3)
 
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:240: UserWarning: The drift of DFLAS.7L2.RB.A12.LD1:U_RES is -0.119 mV/h for constant current from 806.0 to 5882.5 s
  .format(col, slope, unit, t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:240: UserWarning: The drift of DFLAS.7L2.RB.A12.LD1:U_RES is -0.667 mV/h for constant current from 6199.0 to 9745.0 s
  .format(col, slope, unit, t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:240: UserWarning: The drift of DFLAS.7L2.RB.A12.LD2:U_RES is 0.159 mV/h for constant current from 806.0 to 5882.5 s
  .format(col, slope, unit, t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:240: UserWarning: The drift of DFLAS.7L2.RB.A12.LD2:U_RES is 0.636 mV/h for constant current from 6199.0 to 9745.0 s
  .format(col, slope, unit, t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:240: UserWarning: The drift of DFLAS.7R1.RB.A12.LD3:U_RES is 0.205 mV/h for constant current from 806.0 to 5882.5 s
  .format(col, slope, unit, t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:240: UserWarning: The drift of DFLAS.7R1.RB.A12.LD3:U_RES is 0.421 mV/h for constant current from 6199.0 to 9745.0 s
  .format(col, slope, unit, t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:240: UserWarning: The drift of DFLAS.7R1.RB.A12.LD4:U_RES is -0.188 mV/h for constant current from 806.0 to 5882.5 s
  .format(col, slope, unit, t_s, t_e))
/eos/home-m/mmacieje/SWAN_projects/lhc-sm-api/lhcsmapi/pyedsl/AssertionBuilder.py:240: UserWarning: The drift of DFLAS.7R1.RB.A12.LD4:U_RES is -0.469 mV/h for constant current from 6199.0 to 9745.0 s
  .format(col, slope, unit, t_s, t_e))
Out[116]:
<lhcsmapi.pyedsl.AssertionBuilder.AssertionBuilderSignalPlot at 0x7f846fd65400>
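
The drift reported in these warnings is a slope over a time window. One common way to estimate it is a first-order least-squares fit, sketched below with numpy.polyfit. The function slope_in_range and the synthetic signal are assumptions; the document does not state which fitting method or unit conversion AssertionBuilder uses.

```python
import numpy as np
import pandas as pd


def slope_in_range(series, t_start, t_end):
    """Least-squares slope of the series for t_start <= index <= t_end.

    The result is in signal units per index unit (here: per second).
    """
    window = series.loc[t_start:t_end]
    slope, _intercept = np.polyfit(window.index.values, window.values, deg=1)
    return slope


# Synthetic signal drifting by 1e-4 units per second (illustration only)
time_s = np.arange(0.0, 3600.0, 60.0)
u_res = pd.Series(0.5 + 1e-4 * time_s, index=time_s)

slope = slope_in_range(u_res, t_start=0.0, t_end=3000.0)
within_limits = -2e-3 <= slope <= 2e-3
print(slope, within_limits)
```

The warnings above express the drift in mV/h, so a real implementation would also convert the fitted slope from per-second to per-hour units.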
 
 

3. FeatureBuilder()

{SIGNALS}.(FEATURE_CALCULATION).{ASSERTION}

e.g.

FeatureBuilder().with_signal(u_hds_dfs) \
                .calculate_features(features=['first', 'last20mean', 'tau_charge'], index=1544622149599000000)

Supported functions are:

['first', 'first20mean', 'last', 'last20mean', 'max', 'min', 'median', 'std', 'mean', 'tau_charge', 'tau_energy', 'tau_lin_reg', 'tau_exp_fit']

For example, the cell below calculates the initial voltage, the final voltage as the mean of the last 20 points, and the characteristic time of the pseudo-exponential decay of a quench heater voltage.

In [117]:
from lhcsmapi.pyedsl.QueryBuilder import QueryBuilder
from lhcsmapi.pyedsl.FeatureBuilder import FeatureBuilder
import matplotlib.pyplot as plt

timestamp = 1544622149599000000

u_hds_dfs = QueryBuilder().with_pm() \
    .with_timestamp(timestamp) \
    .with_circuit_type('RQ') \
    .with_metadata(circuit_name='RQD.A12', system='QH', signal='U_HDS', source='16L2', wildcard={'CELL': '16L2'}) \
    .signal_query()\
    .synchronize_time(timestamp)\
    .convert_index_to_sec().dfs

ax = u_hds_dfs[0].plot(figsize=(15,7))
u_hds_dfs[1].plot(ax=ax, grid=True)
plt.show()

FeatureBuilder().with_signal(u_hds_dfs) \
                .calculate_features(features=['first', 'last20mean', 'tau_charge'], index=1544622149599000000)
 
Out[117]:
  16L2:U_HDS_1:first 16L2:U_HDS_1:last20mean 16L2:U_HDS_1:tau_charge 16L2:U_HDS_2:first 16L2:U_HDS_2:last20mean 16L2:U_HDS_2:tau_charge
1544622149599000000 880.4621 5.575086 0.07779 872.8354 6.734703 0.076704
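
To make the three feature names concrete, the sketch below computes them for a synthetic exponential decay. The definition used for tau_charge (time integral of the voltage divided by its initial value, which equals the time constant for an ideal exponential) is an assumption, as are the function calculate_basic_features and the sample data; the document does not state how FeatureBuilder defines these features.

```python
import numpy as np
import pandas as pd


def calculate_basic_features(series):
    """Return 'first', 'last20mean' and an assumed 'tau_charge' for one signal."""
    t = series.index.values
    u = series.values
    # Trapezoidal integral of u over t
    integral = np.sum(0.5 * (u[1:] + u[:-1]) * np.diff(t))
    return {
        'first': u[0],
        'last20mean': u[-20:].mean(),
        # Assumption: for u(t) = U0 * exp(-t / tau), integral / U0 is approximately tau
        'tau_charge': integral / u[0],
    }


# Synthetic discharge: 900 V initial voltage, 80 ms time constant (illustration only)
time_s = np.linspace(0.0, 1.0, 2001)
u_hds = pd.Series(900.0 * np.exp(-time_s / 0.08), index=time_s)

features = calculate_basic_features(u_hds)
print(features)
```

For a measured signal that settles at a non-zero level, as the last20mean values of about 5 to 7 in the output above suggest, such an integral-based estimate would be biased unless the baseline is subtracted first.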