All posts by ahashimoto

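Difference between the frame numbers at recording time and the frame numbers in the video

Because of server constraints, we had to save disk space, so the KUSK Database does not store frames without motion for the optical (visible-light) camera and the Kinect camera.

As a result, the actual frame numbers at recording time do not match the frame numbers in the video files.
The number written before the timestamp in each CSV file is the actual frame number.

For example, if the number on line n is m, then frame n of the video file is actually frame m of the original recording.

The algorithm used to decide whether a frame contains motion is described in detail in the following paper.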
A. Hashimoto et al, “KUSK Dataset: Toward a Direct Understanding of Recipe Text and Human Cooking Activity,”  CEA2014
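
The mapping from video frames to actual frame numbers can be sketched in Python as follows. This is only an illustration: it assumes the CSV columns are comma-separated with the frame running number first, and the sample rows and timestamp format are made up, not taken from the dataset. The helper names `load_frame_numbers` and `actual_frame_number` are hypothetical.

```python
import csv
import io


def load_frame_numbers(csv_file):
    """Return a list whose entry i is the actual frame number of video frame i + 1."""
    numbers = []
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip empty lines
        numbers.append(int(row[0]))  # assumption: first column is the running number
    return numbers


def actual_frame_number(frame_numbers, n):
    """Map video frame n (1-based, as in the text) to the actual frame number m."""
    return frame_numbers[n - 1]


# Made-up sample: frames 3 to 6 were dropped because nothing moved.
sample = io.StringIO(
    "1,2014-05-13 20:27:31.000\n"
    "2,2014-05-13 20:27:31.100\n"
    "7,2014-05-13 20:27:31.600\n"
)
frame_numbers = load_frame_numbers(sample)
print(actual_frame_number(frame_numbers, 3))  # prints 7
```

With a real file, pass an open file object (for example `open(path, newline="")`) instead of the in-memory sample.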

Terms of Use (English only)

Kyoto University Dataset(s) Release Agreement

1st October 2014
updated on 27th November 2020
Introduction

The goal of the Kyoto University Smart Kitchen (KUSK) dataset(s) is to develop new techniques, technology, and algorithms for cooking image processing such as Human Activity Recognition, Ingredient Recognition, and Food Tracking. Kyoto University has copyright on the data and is the principal distributor of the KUSK dataset(s). The dataset is meant to aid research efforts in the general area of developing, testing and evaluating algorithms for cooking image processing.
Release of the dataset
To advance the state of the art in various cooking image processing technologies, this dataset is distributed to the research community. All further uses of the dataset will be considered on a case-by-case basis. To obtain the right to access the full dataset, the requestor must sign this document and agree to observe the restrictions listed below.
Consent: The researcher(s) agrees to the following restrictions on the KUSK dataset(s):

  1. Redistribution
    Without prior written approval from Kyoto University, the KUSK dataset(s), in whole or in part, will not be further distributed, published, copied or disseminated in any way or form whatsoever, whether for profit or not. This includes further distributing, copying or disseminating to a different facility or organizational unit in the requesting university, organization, or company.
  2. Modification and Commercial Use
    Without prior approval from Kyoto University, the KUSK dataset(s), in whole or in part, may not be changed or used for commercial purposes.
  3. Requests for the KUSK dataset
    All requests for the KUSK dataset(s) will be forwarded to Kyoto University Principal Investigator(s).
  4. Publication Requirements
    In no case should the still frames be used in any way that could cause the original subject embarrassment or mental anguish. Data of human subjects is provided in coded form (without personal identifying information). Subject consent permits publication (paper or web-based) of the data (including image data) for scientific purposes only.
  5. Citation/Reference
    All documents and papers that report on research that uses the KUSK dataset(s) will acknowledge the use of the dataset by including an appropriate citation to the following.

    • KUSK Food: A. Hashimoto et al, “Recognizing ingredients at cutting process by integrating multimodal features,” ACM Multimedia 2012 workshop on Multimedia for Cooking and Eating Activities
    • KUSK 2013IJ: A. Hashimoto et al, “How Does User’s Access to Object Make HCI Smooth in Recipe Guidance?” 6th International Conf. on Cross-Cultural Design, Held as Part of HCI International 2014
    • KUSK 2014RC: A. Hashimoto et al, “KUSK Dataset: Toward a Direct Understanding of Recipe Text and Human Cooking Activity,” Workshop on Smart Technology for Cooking and Eating Activities, 2014
  6. Publications to Kyoto University
    Kyoto University would appreciate receiving a copy of all reports and papers for public or general release that use the KUSK dataset(s). Please forward them to Kyoto University Principal Investigator(s) upon release or publication.
  7. No Warranty
    THE PROVIDER OF THE DATA MAKES NO REPRESENTATIONS AND EXTENDS NO WARRANTIES OF ANY KIND, EITHER EXPRESSED OR IMPLIED.
    THERE ARE NO EXPRESSED OR IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, OR THAT THE USE OF THE MATERIAL WILL NOT INFRINGE ANY PATENT, COPYRIGHT, TRADEMARK, OR OTHER PROPRIETARY RIGHTS.

KUSK dataset(s) Principal Investigator(s)
Large-Scale Text Archive Laboratory, Academic Center for Computing and Media Studies, Kyoto University
Yoshida Nihonmatsu-cho, Sakyo-ku, Kyoto, 606-8501, JAPAN

Data Format -CHIFFON Log-

About CHIFFON

Check Here

Log Format

Example

[2014-05-13 20:27:31 UTC] [INFO] [Chiffon::Web::Navigator 97] CHECK(guest-2014.05.13_19.56.10.308258): substep03

[timestamp] [log level] [Internal Info] Event Name(Session ID): Event Parameters etc.
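
A log line in this layout can be split into its fields with a regular expression. The following Python sketch is based only on the single example line above; the field names (`timestamp`, `level`, `internal`, `event`, `session`, `params`) and the helper `parse_log_line` are hypothetical, and real logs may contain lines that this pattern does not cover.

```python
import re

# Pattern inferred from the one example line; not an official CHIFFON grammar.
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s*"   # [2014-05-13 20:27:31 UTC]
    r"\[(?P<level>[^\]]+)\]\s*"        # [INFO]
    r"\[(?P<internal>[^\]]+)\]\s*"     # [Chiffon::Web::Navigator 97]
    r"(?P<event>[A-Z_]+)"              # CHECK, NAVI_MENU, EXTERNAL_INPUT, ...
    r"\((?P<session>[^)]*)\)"          # (guest-2014.05.13_19.56.10.308258)
    r":?\s*(?P<params>.*)$"            # substep03
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match the pattern."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


example = (
    "[2014-05-13 20:27:31 UTC] [INFO] [Chiffon::Web::Navigator 97] "
    "CHECK(guest-2014.05.13_19.56.10.308258): substep03"
)
record = parse_log_line(example)
print(record["event"], record["session"], record["params"])
```

Returning `None` for unmatched lines keeps the caller in control of how to treat unexpected log entries.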

Events

Events self-annotated by Subjects

NAVI_MENU,CHANNEL,CHECK,PLAY_CONTROL,END

Events from wizard or any external recognizer

EXTERNAL_INPUT
* The event parameter is a command sent by the wizard or the external recognizer.
* Basic commands are shown here.

Other Events

START

Note

* All events logged in KUSK2014RC are annotated by subjects.
* KUSK2013IJ includes annotations by both subjects and the wizard.

Data Format -Load Sensor-

 Sensor

HBM PW6D

 

Sensor Location

load_table.csv : in preparation

load_stove.csv : in preparation

See also the following paper on estimating the center of gravity.

(A. Schmidt et al, “Context Acquisition Based on Load Sensing,” Ubicomp 2002 )

Data Logger

File Format

.csv (comma separated values)

(compressed by 7zip)

Data Header (table, stove)

1st column: unit ID for each data. (400 => Kg.)

2nd column: offset values.

3rd column: Start Time

Column Layout of Data Rows (table, stove)

  1. Elapsed Time (in seconds)
  2. Left Top (Kg.)
  3. Right Top (Kg.)
  4. Left Bottom (Kg.)
  5. Right Bottom (Kg.)
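
As an illustration of how the four load values in a row can be combined, the Python sketch below computes the total load and a load-weighted center position. It is not the dataset authors' method: since the sensor locations are not yet published, it assumes the four sensors sit at the corners of a rectangle whose `width` and `depth` are placeholder values, and the function name `center_of_load` is hypothetical.

```python
def center_of_load(row, width=1.0, depth=1.0):
    """Return (elapsed, total_kg, (x, y)) for one data row.

    row: [elapsed, left_top, right_top, left_bottom, right_bottom]
    x is measured from the left edge, y from the bottom edge.
    width and depth are assumed sensor spacings, not dataset values.
    """
    elapsed, lt, rt, lb, rb = (float(v) for v in row)
    total = lt + rt + lb + rb
    if total == 0:
        return elapsed, total, None  # no load, position undefined
    x = (rt + rb) / total * width    # share of load on the right side
    y = (lt + rt) / total * depth    # share of load on the top side
    return elapsed, total, (x, y)


# Made-up row: 3 kg in total, all of it on the right, evenly split top/bottom.
print(center_of_load(["0.5", "0", "1.5", "0", "1.5"]))
```

Offsets from the data header would need to be subtracted from the raw values first if they are not already applied.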

Data Format -Thermal Camera-

 

Sensor

ARTCAM-320-THERMO-HYBRID

File Format

Raw Data

32bit float

.7z (7zip)
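* The png files are saved as 4-channel 8-bit images, so the four bytes of each pixel must be reinterpreted as one 32-bit float. For example, load a png file with the following code.

#include <opencv2/opencv.hpp>
cv::Mat read = cv::imread("temp.png", cv::IMREAD_UNCHANGED);  // 4-channel 8-bit image as stored
// Reinterpret the same buffer as 1-channel 32-bit float; readImage shares memory with read.
cv::Mat readImage = cv::Mat(read.rows, read.cols, CV_32FC1, read.data);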
Visualized Data

Extension .mp4

Codec:  H265 (not H264!)

Audio: none

The visualized video shows temperature as color, with the minimum and maximum values assigned dynamically to blue and red during the observation. If you need a time series of absolute temperature values, use the Raw Data provided in the 7z archive.

Decoding Command Sample
ffmpeg -i downloaded_movie.mp4 -c:v png ./output_dir/%07d.png

 

Encoding Command at compression

ffmpeg  -i  input_%07d.png -c:v hevc -pix_fmt yuv420p -strict experimental -preset veryslow -qp 0 output.mp4

*Currently, not many video players support the H265 codec. If yours does not, convert the H265 video to any other codec you prefer with ffmpeg.
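
A conversion of this kind could be scripted as in the Python sketch below. It only builds the ffmpeg command line. The target codec `libx264` and the file names are example choices, not a recommendation from the dataset authors, and `build_transcode_command` is a hypothetical helper.

```python
import shlex


def build_transcode_command(src, dst, video_codec="libx264"):
    """Build an ffmpeg command that re-encodes src into dst with another codec."""
    return ["ffmpeg", "-i", src, "-c:v", video_codec, dst]


cmd = build_transcode_command("downloaded_movie.mp4", "converted_h264.mp4")
print(shlex.join(cmd))
```

To actually run the conversion, pass the list to `subprocess.run(cmd, check=True)`. Note that re-encoding with a lossy codec does not preserve the lossless quality of the original file.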

Data Format -RGB-D Camera-

Sensor

Kinect for Xbox 360

File Format

RGB

Extension .mp4

Codec:  H265 (not H264!)

Audio: none

Caution

In some datasets, video frames without motion are dropped to save server storage.
If your processing is time-sensitive, please check how the frame running number is treated. 2014RC also has a problem of jitter in the frame rate. (This problem is limited to the data with 10fps videos.)

Decode/Encode commands

Decoding Command Sample
ffmpeg -i downloaded_movie.mp4 -c:v png ./output_dir/%07d.png
Encoding Command at compression
ffmpeg  -i  input_color_%07d.png -c:v hevc -pix_fmt yuv420p -strict experimental -preset veryslow -qp 0 output.mp4

*Currently, not many video players support the H265 codec. If yours does not, convert the H265 video to any other codec you prefer with ffmpeg.

Depth

Extension .mov

Codec:  ffv1 (lossless)

Audio: none

Decoding Command Sample
ffmpeg -i downloaded_movie.mov -c:v png ./output_dir/%07d.png

* Caution: because the depth data are 16-bit grayscale, converting them to jpg, bmp, or other video codecs will lose the higher bits.

Encoding Command at compression
ffmpeg -i input_depth_%07d.png -c:v ffv1 -pix_fmt gray16le output.mov
Time Stamp & Bone Track

See also the Microsoft documentation for Skeleton Frame.

  1. Frame Running Number
  2. time stamp
  3. skeleton.dwFrameNumber
  4. skeleton.dwFlags
  5. skeleton.vFloorClipPlane (x:y:z) (separator ‘:’)
  6. skeleton.vNormalToGravity(x:y:z)
  7. SKELETON_DATA 1 (if detected)
  8. SKELETON_DATA 2 (if detected)

SKELETON_DATA (separator ‘|’)

See also the Microsoft documentation for Skeleton Data.

  1. data.eTrackingState
  2. data.dwTrackingID
  3. data.dwEnrollmentIndex
  4. data.dwUserIndex
  5. data.Position (x:y:z)
  6. data.SkeletonPositions[0] (x:y:z)
  ...
  26. data.SkeletonPositions[20] (x:y:z)
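
The nested layout above (top-level columns, ‘|’ inside each skeleton data field, ‘:’ inside each vector) can be parsed as in the following Python sketch. Several points are assumptions rather than facts from the dataset: the top-level separator is taken to be a comma, the timestamp is kept as an uninterpreted string, and all dictionary keys and function names are hypothetical. The sample line is synthetic.

```python
def parse_vec(text):
    """Split an 'x:y:z' field into a tuple of floats."""
    return tuple(float(v) for v in text.split(":"))


def parse_skeleton_data(field):
    """Parse one '|'-separated skeleton data field (26 items)."""
    parts = field.split("|")
    return {
        "tracking_state": parts[0],
        "tracking_id": parts[1],
        "enrollment_index": parts[2],
        "user_index": parts[3],
        "position": parse_vec(parts[4]),
        "joints": [parse_vec(p) for p in parts[5:26]],  # SkeletonPositions[0..20]
    }


def parse_bone_track_line(line, sep=","):
    """Parse one line; sep="," is an assumption about the top-level separator."""
    cols = line.rstrip("\n").split(sep)
    return {
        "running_number": int(cols[0]),
        "timestamp": cols[1],
        "frame_number": cols[2],
        "flags": cols[3],
        "floor_clip_plane": parse_vec(cols[4]),
        "normal_to_gravity": parse_vec(cols[5]),
        "skeletons": [parse_skeleton_data(c) for c in cols[6:] if c],
    }


# Synthetic line with one detected skeleton and 21 identical joints.
joints = "|".join(["0.1:0.2:0.3"] * 21)
line = "12,2014-05-13 20:27:31,345,0,0:1:0,0:1:0,2|1|0|0|0.0:0.5:2.0|" + joints
record = parse_bone_track_line(line)
print(record["running_number"], len(record["skeletons"][0]["joints"]))
```

Lines without a detected skeleton simply yield an empty `skeletons` list under these assumptions.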

Data Format -Optical Camera-

Sensor

Point Grey Research FL3-U3-32S2C-CS

File Format

Extension .mp4

Codec:  H265 (not H264!)

Audio: none

Caution

In some datasets, video frames without motion are dropped to save server storage.
If your processing is time-sensitive, please check how the frame running number is treated. 2014RC also has a problem of jitter in the frame rate. (This problem is limited to the data with 10fps videos.)

Decode/Encode commands

Decoding Command Sample
ffmpeg -i downloaded_movie.mp4 -c:v png ./output_dir/%07d.png

Encoding Command at compression
ffmpeg  -i  input_%07d.png -c:v hevc -pix_fmt yuv420p -strict experimental -preset veryslow -qp 0 output.mp4

*Currently, not many video players support the H265 codec. If yours does not, decode the H265 video to any other codec (or to png files, as in the decoding command sample above).

Time Stamp
  1. Frame Running Number
  2. time stamp
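
When dropped frames matter, for example when aligning the video with other sensors, one option is to rebuild a dense timeline from the frame running numbers. The Python sketch below shows one possible treatment, not a prescribed one: for each original frame number it picks the last stored video frame, on the assumption that a dropped frame looked the same as the preceding stored one. The function name and the sample numbers are hypothetical.

```python
import bisect


def video_index_for(running_numbers, m):
    """Return the 0-based video frame index to show for original frame m.

    running_numbers: frame running numbers from the time stamp file,
    in ascending order (one per stored video frame).
    Returns None if m lies before the first stored frame.
    """
    i = bisect.bisect_right(running_numbers, m) - 1
    return i if i >= 0 else None


stored = [1, 2, 7, 8]  # made-up numbers: frames 3 to 6 were dropped
timeline = [video_index_for(stored, m) for m in range(1, 9)]
print(timeline)  # prints [0, 1, 1, 1, 1, 1, 2, 3]
```

Because of the frame-rate jitter noted above, the time stamps themselves, rather than the frame numbers alone, are the safer basis for timing in the affected 10fps data.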
